I want to build a query for a very dynamic collection.

An example:

I have a collection like

{
  _id: ObjectId(),
  value: x
  // some other data
}

The example dataset has the values

{
  value: 1
},
{
  value: 1
},
{
  value: 2
},
{
  value: 3
},
{
  value: 3
}

As you can see, the same value can occur multiple times.

But if I run the following query, it returns only the first document with value: 3:

db.collection.aggregate([
  {
    $sort: { value: 1 }
  },
  {
    $limit: 4
  }
])

What I want is at least 4 documents, extended so that every document sharing the last returned value is included. Here that means all documents where value: 3.

Edit

Sorry, the question might be a bit misleading. I want a complete result, so all documents with value: 3. It is for a public transport database, and the value is the departure time. I want at least the next 30 departures, but if the 30th and 31st depart at the same time, I want the 31st as well.

2 Answers

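I now use a small Python function that extends the limit the way I want. Since the query returns a cursor, documents are only fetched as I iterate, so I do not waste resources. I do not specify a limit in the query itself, and the query must sort by value for this to work.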

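def extend_limit(cursor, original_limit):
    """Return the first original_limit documents plus any ties on 'value'.

    Assumes the cursor is sorted by 'value'.
    """
    result = []

    try:
        # Take the requested number of documents.
        while original_limit > 0:
            result.append(next(cursor))
            original_limit -= 1

        # Nothing was requested, so there is no last value to extend.
        if not result:
            return result

        last_element = result[-1]

        # Keep going while the following documents share the last value.
        while True:
            next_element = next(cursor)

            if last_element['value'] != next_element['value']:
                break

            result.append(next_element)

    except StopIteration:
        # The cursor ran out of documents; return what was collected.
        pass

    return result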
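
The same idea can probably be written more compactly with itertools. This is only a sketch: take_with_ties and its key parameter are names made up for illustration, and it is run here against a plain list standing in for a sorted cursor, not against MongoDB.

```python
from itertools import islice, takewhile


def take_with_ties(docs, limit, key="value"):
    """Return the first `limit` items of a sorted iterable, plus any
    following items whose `key` equals that of the last item taken."""
    it = iter(docs)
    head = list(islice(it, limit))

    # Fewer items than requested (or none at all): nothing to extend.
    if not head or len(head) < limit:
        return head

    last = head[-1][key]
    # Append the run of items that tie with the last one taken.
    head.extend(takewhile(lambda d: d[key] == last, it))
    return head


# The example dataset from the question, already sorted by value.
sample = [{"value": v} for v in (1, 1, 2, 3, 3)]
result = take_with_ties(sample, 4)
print([d["value"] for d in result])  # [1, 1, 2, 3, 3]
```

Note that takewhile reads one item past the ties, so the iterator should not be reused afterwards.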

Thanks to Adam Comerford

There is no need to use aggregation here, just do a normal find with a projection, a sort and a limit:

db.collection.find({}, {_id : 0, value : 1}).sort({value : 1}).limit(4)

I'd recommend that you actually query on some criteria (rather than empty in my example) and that the criteria have an appropriate index that includes the sorted field if possible (for performance reasons).

2 Comments

Sorry, the question might be a bit misleading. I want a complete result, so all documents with value: 3. It is for a public transport database, and the value is the departure time. I want at least the next 30 departures, but if the 30th and 31st depart at the same time, I want the 31st as well.
Then just don't specify the limit and you will get all results (at least up to the total that fits in a batch). You just adjust the criteria for a given query to exclude results before or after a certain point in time.
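
One possible reading of "adjust the criteria" is a two-step approach: look up the value of the Nth document, then query everything between the start point and that cutoff without a limit. The sketch below only builds the filter document and is checked against an in-memory list. The helper names cutoff_value and window_filter are hypothetical, and the PyMongo calls in the comments are not executed here.

```python
def cutoff_value(sorted_values, n):
    """Value of the n-th item (1-based) of an ascending list, or the last
    value if there are fewer than n items. Returns None for empty input."""
    if not sorted_values or n <= 0:
        return None
    return sorted_values[min(n, len(sorted_values)) - 1]


def window_filter(start, cutoff, field="value"):
    """Build a MongoDB-style filter for start <= field <= cutoff."""
    return {field: {"$gte": start, "$lte": cutoff}}


# Values from the question's example dataset, sorted ascending.
values = [1, 1, 2, 3, 3]

# Step 1: find the value of the 4th document. Against a real collection
# this could be something like (assumption, not run here):
#   collection.find(criteria).sort("value", 1).skip(3).limit(1)
cutoff = cutoff_value(values, 4)

# Step 2: select everything up to and including the cutoff, without a
# limit, e.g. collection.find(query).sort("value", 1)
query = window_filter(1, cutoff)
print(query)  # {'value': {'$gte': 1, '$lte': 3}}
```

This costs two round trips instead of one, but the second query returns every document that ties on the cutoff value.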
