
I am trying to run a number of MongoDB queries via Node's async library, but they are still slow. The database is indexed and, as far as I can tell, fully optimised. Is there a way to increase query speed through MongoDB administration, or to improve performance by allocating more memory to it?

Watching the console, the queries run one by one, and some take so long that the request never gets a response.
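The pattern is essentially the following (a simplified sketch with assumed connection details and hard-coded ZIP codes, not the actual application code):

    // Simplified sketch: async.series runs one task at a time, waiting for
    // each count to finish before starting the next one.
    // Assumes the async library and the MongoDB Node.js 2.x driver.
    var async = require('async');
    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/consumers', function (err, db) {
      if (err) throw err;
      var consumers = db.collection('consumer1s');

      async.series([
        function (callback) {
          consumers.count({ ZIP: 37089,
            $or: [{ ADULTS_F_18_24: 'Y' }, { ADULTS_F_24_35: 'Y' }] }, callback);
        },
        function (callback) {
          consumers.count({ ZIP: 37024,
            $or: [{ ADULTS_F_18_24: 'Y' }, { ADULTS_F_24_35: 'Y' }] }, callback);
        }
      ], function (err, counts) {
        if (err) throw err;
        console.log(counts);  // one count per query, in order
        db.close();
      });
    });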

2015-12-29T10:31:48.958-0800 I COMMAND  [conn63] command consumers.$cmd command: count { count: "consumer1s", query: { ZIP: 37089, $or: [ { ADULTS_F_18_24: "Y" }, { ADULTS_F_24_35: "Y" } ] } } planSummary: IXSCAN { ZIP: 1.0, GENDER: 1.0 } keyUpdates:0 writeConflicts:0 numYields:1 reslen:44 locks:{ Global: { acquireCount: { r: 4 } }, MMAPV1Journal: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 } }, Collection: { acquireCount: { R: 2 }, acquireWaitCount: { R: 2 }, timeAcquiringMicros: { R: 54270 } } } 146ms

2015-12-29T10:31:54.925-0800 I COMMAND  [conn62] command consumers.$cmd command: count { count: "consumer1s", query: { ZIP: 37024, $or: [ { ADULTS_F_18_24: "Y" }, { ADULTS_F_24_35: "Y" } ] } } planSummary: IXSCAN { ZIP: 1.0, GENDER: 1.0 } keyUpdates:0 writeConflicts:0 numYields:88 reslen:44 locks:{ Global: { acquireCount: { r: 178 } }, MMAPV1Journal: { acquireCount: { r: 172 } }, Database: { acquireCount: { r: 89 } }, Collection: { acquireCount: { R: 89 }, acquireWaitCount: { R: 83 }, timeAcquiringMicros: { R: 1654781 } } } 6114ms

Please see the logs above to understand my question: two queries following the same plan (IXSCAN on { ZIP: 1.0, GENDER: 1.0 }) have a large difference in execution time (146 ms vs 6114 ms). What is the reason, and how can I fix it?

The following information may be useful: I am running this application on a Mac, OS X Yosemite 10.10.2, 3.2 GHz Intel Core i5, 8 GB 1600 MHz DDR3 memory. Any suggestions on how I can allocate more virtual memory to MongoDB?

  • You need to look at your logs and monitor resource usage during the queries. There is no silver bullet for improving query performance. Commented Dec 28, 2015 at 12:19
  • The indexes are there ... Commented Dec 28, 2015 at 15:39
  • It depends on your real environment: is it high concurrency or big data? If it's high concurrency, you'd be better off using Redis, Memcached, or another caching solution. Meanwhile, I'm also looking into optimising MongoDB itself ... Commented Jan 20, 2016 at 5:41
  • What is ADULTS_F_24_35? It seems it should be a boolean, and I'm not sure it falls under the GENDER index, but I could be wrong. Could you provide the document structure and the indexes you're using? Commented Jan 21, 2016 at 18:40
  • The documents also seem oddly structured. You seem to cram a lot of context into a single field; consider a separate field for age/age range (or make it a boolean) and an entirely separate field for gender, as sketched below. Commented Jan 21, 2016 at 19:00
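For illustration, a restructuring along the lines of that last comment might look like the following mongo shell sketch (the AGE_RANGE field name and its values are assumptions; the real document structure has not been posted):

    // Hypothetical restructured document: gender and age range live in their
    // own fields instead of combined flag fields like ADULTS_F_18_24.
    db.consumer1s.insert({ ZIP: 37089, GENDER: "F", AGE_RANGE: "18_24" })

    // A compound index matching the shape of the count query:
    // equality on ZIP and GENDER, then the age range.
    db.consumer1s.createIndex({ ZIP: 1, GENDER: 1, AGE_RANGE: 1 })

    // The $or over two flag fields becomes an $in over one field,
    // which the index above can satisfy directly.
    db.consumer1s.count({
      ZIP: 37089,
      GENDER: "F",
      AGE_RANGE: { $in: ["18_24", "24_35"] }
    })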

1 Answer


As @Martin said, you need to profile. Use something like cursor.explain to make sure the query is using indexes and to find weak points. Use whatever resource monitor your system has (like top/htop on Linux) to see if it's running out of memory or if it's CPU-bound.
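For example, assuming MongoDB 3.0+ (the MMAPv1 log lines suggest so), one of the slow counts from the logs can be explained directly in the mongo shell:

    // Run against the consumers database in the mongo shell.
    // explain("executionStats") reports the winning plan, keys examined and
    // documents examined for the same count that shows up in the log.
    db.consumer1s.explain("executionStats").count({
      ZIP: 37024,
      $or: [{ ADULTS_F_18_24: "Y" }, { ADULTS_F_24_35: "Y" }]
    })

    // The existing indexes can be listed with:
    db.consumer1s.getIndexes()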

"The queries are running one by one" -- I assume you're not using async.series or similar, which is sequential.


5 Comments

Queries are essentially never CPU-bound. Aggregations and huge inserts are usually the only operations that are CPU-intensive. I've done a lot of performance profiling of queries, and they do next to nothing to the CPU unless there is a huge quantity of them. There is basically no operator in a regular query that is very complicated, apart from geospatial and text search. Really only aggregations are going to consume much CPU, and even then usually on the lower end.
Running in series or parallel shouldn't make a noticeable difference; anything beyond 200 ms for a basic query is concerning. The amount of time saved by running in parallel would be pretty difficult for a human to discern. Parallel is probably better in production, assuming you're not running 50 queries in parallel, which could become a point for DoS. I usually run in series when I'm processing an array of unknown size, and in parallel when I know the size won't go beyond a certain amount. They both serve a purpose.
All good points @tsturzl. The edited question has more useful info -- the high numYields and high timeAcquiringMicros look like they could be slow disk, swapping/paging, and/or concurrency issues?
Looking at the quoted output, I don't think the GENDER index is being used as it should be, due to the strange data structure.
Oh right, missed that, good point. (I'd post that as a possible answer :))
