I have a collection with the following data:

{
    "_id" : ObjectId("5516d416d0c2323619ddbca8"),
    "date" : "28/02/2015",
    "driver" : "user1",
    "passengers" : [
        {
            "user" : "user2",
            "times" : 2
        },
        {
            "user" : "user3",
            "times" : 3
        }
    ]
}
{
    "_id" : ObjectId("5516d517d0c2323619ddbca9"),
    "date" : "27/02/2015",
    "driver" : "user2",
    "passengers" : [
        {
            "user" : "user1",
            "times" : 2
        },
        {
            "user" : "user3",
            "times" : 2
        }
    ]
}

I would like to run an aggregation that tells me, for a given passenger, how many times they rode with each driver. In my example the result would be:

for user1: [{ driver: user2, times: 2 }]
for user2: [{ driver: user1, times: 2 }]
for user3: [{ driver: user1, times: 3 }, { driver: user2, times: 2 }]

I'm quite new to MongoDB. I know how to do a simple aggregation with sum, but not when the values are inside arrays and the subject itself is an element of the array. What is the appropriate way to perform this kind of aggregation, and more specifically, how do I perform it in an Express.js-based server?

1 Answer

To do this with the aggregation framework, start with a $match stage that keeps only the documents whose passengers array contains the user in question. Next, a $unwind stage deconstructs the passengers array so that one document is output per array element. A second $match on the unwound documents then drops the elements that belong to other passengers, and a final $project stage returns only the required fields. The aggregation pipeline for user3 would look like this:

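db.collection.aggregate([
    // Keep only the rides that include user3 as a passenger
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Output one document per element of the passengers array
    {
        "$unwind": "$passengers"
    },
    // Drop the unwound elements that belong to other passengers
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Return only the driver and the number of times
    {
        "$project": {
            "_id": 0,
            "driver": "$driver",
            "times": "$passengers.times"
        }
    }
])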
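
Since the question also asks about an Express.js server, here is a minimal sketch of how the same pipeline might be wired into a route. It assumes the official `mongodb` Node.js driver, whose `collection.aggregate(pipeline).toArray()` returns the matching documents. The names `buildDriverTimesPipeline`, `driverTimesHandler`, the route path and the `collection` variable are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: builds the pipeline shown above for any passenger.
function buildDriverTimesPipeline(passenger) {
  return [
    { $match: { "passengers.user": passenger } },
    { $unwind: "$passengers" },
    { $match: { "passengers.user": passenger } },
    { $project: { _id: 0, driver: "$driver", times: "$passengers.times" } },
  ];
}

// Hypothetical route factory. `collection` is assumed to be a Collection
// object obtained from the official `mongodb` driver.
function driverTimesHandler(collection) {
  return async function (req, res) {
    try {
      const pipeline = buildDriverTimesPipeline(req.params.user);
      const rows = await collection.aggregate(pipeline).toArray();
      res.json(rows);
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  };
}

// Assumed wiring inside an existing Express app:
// app.get("/passengers/:user/drivers", driverTimesHandler(collection));
```

Passing the collection in as an argument keeps the pipeline builder free of any database connection, so it can be checked on its own.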

Result:

/* 0 */
{
    "result" : [ 
        {
            "driver" : "user1",
            "times" : 3
        }, 
        {
            "driver" : "user2",
            "times" : 2
        }
    ],
    "ok" : 1
}
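
To see why both the $unwind and the second $match are needed, it can help to mimic the stages in plain JavaScript on the two sample documents from the question. This is only an in-memory illustration of what the stages do, not how MongoDB executes them; the `rides` array and the `driverTimesFor` function are hypothetical names.

```javascript
// The two sample documents from the question (without _id and date).
const rides = [
  { driver: "user1", passengers: [{ user: "user2", times: 2 }, { user: "user3", times: 3 }] },
  { driver: "user2", passengers: [{ user: "user1", times: 2 }, { user: "user3", times: 2 }] },
];

function driverTimesFor(docs, passenger) {
  return docs
    // like $unwind: one record per element of the passengers array
    .flatMap((doc) => doc.passengers.map((p) => ({ driver: doc.driver, passengers: p })))
    // like the second $match: keep only the requested passenger
    .filter((row) => row.passengers.user === passenger)
    // like $project: keep only driver and times
    .map((row) => ({ driver: row.driver, times: row.passengers.times }));
}

console.log(driverTimesFor(rides, "user3"));
// [ { driver: 'user1', times: 3 }, { driver: 'user2', times: 2 } ]
```

Without the filter step, the unwound records for user2 and user1 would also appear in the output, which is why the pipeline matches on the passenger a second time.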

UPDATE:

To group duplicate drivers that appear on different dates, as you mentioned, add a $group stage just before the final $project stage and compute each driver's total times with the $sum operator:

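db.collection.aggregate([
    // Keep only the rides that include user3 as a passenger
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Output one document per element of the passengers array
    {
        "$unwind": "$passengers"
    },
    // Drop the unwound elements that belong to other passengers
    {
        "$match": {
            "passengers.user": "user3"
        }
    },
    // Sum the times per driver across all dates
    {
        "$group": {
            "_id": "$driver",
            "total": {
                "$sum": "$passengers.times"
            }
        }
    },
    // Rename the group key back to "driver"
    {
        "$project": {
            "_id": 0,
            "driver": "$_id",
            "total": 1
        }
    }
])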

Result:

/* 0 */
{
    "result" : [ 
        {
            "total" : 2,
            "driver" : "user2"
        }, 
        {
            "total" : 3,
            "driver" : "user1"
        }
    ],
    "ok" : 1
}
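
The question lists the desired output for every passenger, so one possible extension is to drop the per-user $match stages and group twice: first by (passenger, driver) pair, then by passenger. The pipeline below is a sketch of that idea and is not part of the original answer; `allPassengersPipeline` and `summarizeAllPassengers` are hypothetical names, and the plain JavaScript function only mirrors the grouping in memory so the logic can be checked without a database.

```javascript
// Sketch: totals for every passenger in one aggregation.
const allPassengersPipeline = [
  { $unwind: "$passengers" },
  // Sum times per (passenger, driver) pair
  {
    $group: {
      _id: { user: "$passengers.user", driver: "$driver" },
      times: { $sum: "$passengers.times" },
    },
  },
  // Collect the per-driver totals under each passenger
  {
    $group: {
      _id: "$_id.user",
      drivers: { $push: { driver: "$_id.driver", times: "$times" } },
    },
  },
];

// In-memory mirror of the same grouping, returning { passenger: { driver: total } }.
function summarizeAllPassengers(docs) {
  const totals = {};
  for (const doc of docs) {
    for (const p of doc.passengers) {
      totals[p.user] = totals[p.user] || {};
      totals[p.user][doc.driver] = (totals[p.user][doc.driver] || 0) + p.times;
    }
  }
  return totals;
}
```

With the official Node.js driver, the pipeline would be passed to `collection.aggregate(allPassengersPipeline).toArray()`. The order of the grouped results is not guaranteed unless a $sort stage is added.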

1 Comment

Thanks, that was very close to what I needed (it didn't group duplicate drivers from different dates). I ended up using a $group: {"_id": "$driver", "total": {$sum: "$passengers.times"}}
