-1

I have a users collection with a compound index:

db.users.createIndex({
  bin: 1,
  gender: 1,
  age: 1,
  location: 1,
  // ... other fields
});

When I query like this:

db.users.find({
  bin: X,
  gender: Y,
  age: Z,
  location: L
}).sort({ bin: 1 });

MongoDB is able to use the index to both filter and avoid an in-memory sort, since the equality filters consume the leftmost prefix and the sort on bin is trivial.

But when I change the query to use an $in on bin:

db.users.find({
  bin: { $in: [x, y, z] },
  gender: Y,
  age: Z,
  location: L
}).sort({ bin: 1 });

I notice that the query still uses the index for filtering, but will it perform an in-memory sort for the bin ordering?

Question:

Why can’t MongoDB use the index to satisfy the sort in the $in case, and is there a way to structure the query or index so that the sort on bin can be handled by the index alone?

6
  • 1
    I think the answer here is yes. Run that query using explain with executionStats to get that answer directly. Commented Aug 28 at 17:16
  • 1
    Assuming L, Y, and Z are equality matches try building an index on {gender:1, age:1, location:1, bin:1} and compare performance between them Commented Aug 28 at 17:52
  • 1
    Why do you use an index on "age"? Why do you store it at all? Because age is changing all the time. Commented Aug 28 at 20:02
  • @WernfriedDomscheit it's simplified example. I index year of birth, not age. Commented Aug 28 at 20:11
  • 1
    here is an example of index use for $in match + sort: mongoplayground.net/p/C4KVTtgGbcn Commented Aug 28 at 21:12

1 Answer 1

2

It should not have to do an additional sort as "bin" comes in order:

for (let bin of [1, 2, 3]) {
  for (let gender of ["M", "F"]) {
    for (let age of [20, 30]) {
      for (let loc of ["NY", "LA"]) {
        for (let i = 0; i < 3; i++) {
          db.users.insertOne({ bin, gender, age, location: loc, name: `User_${bin}_${gender}_${age}_${loc}_${i}`
          })
        }
      }
    }
  }
}

db.users.createIndex({
  bin: 1, gender: 1, age: 1, location: 1
})

db.users.find({
  bin: { $in: [1, 2, 3] },
  gender: "M", age: 20, location: "NY"
}).sort({ bin: 1 }).explain("executionStats").executionStats

One IXSCAN with seeks and no SORT:

{
  executionSuccess: true,
  nReturned: 9,
  executionTimeMillis: 1,
  totalKeysExamined: 14,
  totalDocsExamined: 9,
  executionStages: {
    isCached: false,
    stage: 'FETCH',
    nReturned: 9,
    executionTimeMillisEstimate: 0,
    works: 14,
    advanced: 9,
    needTime: 4,
    needYield: 0,
    saveState: 0,
    restoreState: 0,
    isEOF: 1,
    docsExamined: 9,
    alreadyHasObj: 0,
    inputStage: {
      stage: 'IXSCAN',
      nReturned: 9,
      executionTimeMillisEstimate: 0,
      works: 14,
      advanced: 9,
      needTime: 4,
      needYield: 0,
      saveState: 0,
      restoreState: 0,
      isEOF: 1,
      keyPattern: { bin: 1, gender: 1, age: 1, location: 1 },
      indexName: 'bin_1_gender_1_age_1_location_1',
      isMultiKey: false,
      multiKeyPaths: { bin: [], gender: [], age: [], location: [] },
      isUnique: false,
      isSparse: false,
      isPartial: false,
      indexVersion: 2,
      direction: 'forward',
      indexBounds: {
        bin: [ '[1, 1]', '[2, 2]', '[3, 3]' ],
        gender: [ '["M", "M"]' ],
        age: [ '[20, 20]' ],
        location: [ '["NY", "NY"]' ]
      },
      keysExamined: 14,
      seeks: 5,
      dupsTested: 0,
      dupsDropped: 0
    }
  }
}

If you have something different, please share your execution plan


Follow up on this question:

does the lack of a SORT stage in the explain output indicate that the sorting was non-blocking, via the index?

To prove that there's no blocking sort (= read all entries and sort them before being consumed by upper stage) you can add a limit and verify that it reads only the necessary keys. For example, this:

db.users.find({
  bin: { $in: [1, 2, 3] },
  gender: "M", age: 20, location: "NY"
}).sort({ bin: 1 }).limit(3).explain("executionStats").executionStats

will show:

  nReturned: 3,
  totalKeysExamined: 3,
  totalDocsExamined: 3,
Sign up to request clarification or add additional context in comments.

2 Comments

does the lack of a SORT stage in the explain output indicate that the sorting was non-blocking, via the index?
Yes, and one way to prove it, add a .limit() to the query and see that, still without sort, it doesn't examine more keys than necessary. Adding an example to the answer.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.