There are many documents:

{
        "_id"   : ObjectId("506ddd1900a47d802702a904"),
        "subid" : "s1",
        "total" : "300",
        "details" :[{
                      name:"d1", value: "100"
                    },
                    {
                      name:"d2", value: "200"
                    }]
}
{
        "_id"   : ObjectId("306fff1900a47d802702567"),
        "subid" : "s1",
        "total" : "700",
        "details" : [{
                      name:"d1", value: "300"
                    },
                    {
                      name:"d8", value: "400"
                    }]
 }

Elements in 'details' arrays may vary.

The question is: how can I get the following result with the aggregation framework and Java?

{
        "_id"     : "s1",
        "total"   : "1000",
        "details" : [{
                      name:"d1", value: "400"
                    },
                    {
                      name:"d2", value: "200"
                    },
                    {
                      name:"d8", value: "400"
                    }]
 }

Or maybe I should use custom map-reduce functions here?

1 Answer

This is very achievable with aggregate, though a little obtuse, so let's run through it:

db.collection.aggregate([

    // First Group to get the *master* total for the documents
    {"$group": {
        "_id": "$subid",
         "total": { "$sum": "$total" },
         details: { "$push": "$details" } 
     }},

     // Unwind the details
     {"$unwind": "$details"},

     // Unwind the details "again" since you *pushed* an array onto an array
     {"$unwind":"$details"},

     // Now sum up the values by each name (keeping levels)
     {"$group": {
         "_id": {
              "_id": "$_id",
              "total": "$total",
              "name":  "$details.name"
          },
          "value": {"$sum": "$details.value"}
      }},

     // Sort the names (because you expect that!)
     {"$sort": { "_id.name": 1}},

     // Do some initial re-shaping for convenience
     {"$project": {
         "_id": "$_id._id",
         "total": "$_id.total",
         "details": { "name": "$_id.name", "value": "$value" }
     }},

     // Now push everything back into an array form
     {"$group": {
         "_id": {
              "_id": "$_id",
              "total": "$total"
         },
         "details": {"$push": "$details"}
     }},

     // And finally project nicely
     {"$project": {
         "_id": "$_id._id",
         "total": "$_id.total",
         "details": 1 
     }}
])

So if you gave that a try before, you might have missed the initial group, which is what computes the top-level sum of the total field across your documents.
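One caveat worth checking on your own data: the sample documents store "total" and "value" as strings. $sum skips non-numeric values, so summing strings yields 0 rather than 1000. Here is a plain JavaScript sketch of that behaviour (no MongoDB involved; sumLikeMongo is a hypothetical helper that only mimics the rule, it is not a driver function):

```javascript
// Hypothetical helper that mimics how $sum treats its inputs:
// numeric values are added, everything else is skipped.
function sumLikeMongo(values) {
  return values.reduce(
    (acc, v) => (typeof v === "number" ? acc + v : acc),
    0
  );
}

const asStrings = ["300", "700"]; // how the sample documents store "total"
const asNumbers = asStrings.map(Number); // assumed fix: store real numbers

const stringSum = sumLikeMongo(asStrings); // 0, both strings are skipped
const numberSum = sumLikeMongo(asNumbers); // 1000

console.log(stringSum, numberSum);
```

If your real documents do hold strings, converting those fields to numbers before aggregating is probably the simplest way around this.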

Admittedly, the tricky bit is "getting your head around" the whole double unwind thing that comes next. Because that first group pushed each document's array into another array, we end up with a new nested structure (an array of arrays) that you need to unwind twice in order to get to a "de-normalized" form.
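To make the double unwind concrete, here is a rough plain JavaScript simulation (no MongoDB involved). The unwind function is a hypothetical stand-in for the $unwind stage, and the values are written as numbers, which is an assumption about how the data is stored:

```javascript
// The two sample documents, with numeric values (assumed).
const docs = [
  { subid: "s1", total: 300, details: [{ name: "d1", value: 100 }, { name: "d2", value: 200 }] },
  { subid: "s1", total: 700, details: [{ name: "d1", value: 300 }, { name: "d8", value: 400 }] }
];

// What the first $group produces: $push of "$details" collects each
// document's whole array, so "details" becomes an array of arrays.
const grouped = [{
  _id: "s1",
  total: docs.reduce((acc, d) => acc + d.total, 0),
  details: docs.map(d => d.details)
}];

// Hypothetical stand-in for $unwind: one output row per array element,
// with the array field replaced by that single element.
function unwind(rows, field) {
  return rows.flatMap(row =>
    row[field].map(item => ({ ...row, [field]: item }))
  );
}

const once = unwind(grouped, "details"); // 2 rows, "details" is still an array
const twice = unwind(once, "details");   // 4 rows, "details" is one {name, value}

console.log(once.length, twice.length);
```

After the first pass each row still carries a whole inner array; only the second pass gets you down to one name/value pair per row.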

Once you've done that, you just $group down to the name field, which is roughly the SQL equivalent of:

GROUP BY _id, total, "details.name"

So more or less like that, with some sensible re-shaping. Then I sort by the name key (because you printed it that way), push the details back into an array, and finally $project into the actual form that you wanted.
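The remaining stages (group by name, sort, push back into an array) can be sketched in plain JavaScript as well. This is only an illustration of the logic, not driver code: regroup is a hypothetical helper, and the input rows are what the double unwind would emit, assuming numeric values:

```javascript
// Rows as they look after the double unwind (numeric values assumed).
const rows = [
  { _id: "s1", total: 1000, details: { name: "d1", value: 100 } },
  { _id: "s1", total: 1000, details: { name: "d2", value: 200 } },
  { _id: "s1", total: 1000, details: { name: "d1", value: 300 } },
  { _id: "s1", total: 1000, details: { name: "d8", value: 400 } }
];

// Hypothetical helper covering the second $group, $sort and final $group.
function regroup(input) {
  // GROUP BY _id, total, details.name and sum the values.
  const byName = new Map();
  for (const { _id, total, details } of input) {
    const key = JSON.stringify([_id, total, details.name]);
    const entry = byName.get(key) || { _id, total, name: details.name, value: 0 };
    entry.value += details.value;
    byName.set(key, entry);
  }

  // Sort by name, as the {"$sort": {"_id.name": 1}} stage does.
  const sorted = [...byName.values()].sort((a, b) => a.name.localeCompare(b.name));

  // Push the summed details back into one array per (_id, total).
  const out = new Map();
  for (const { _id, total, name, value } of sorted) {
    const key = JSON.stringify([_id, total]);
    const doc = out.get(key) || { _id, total, details: [] };
    doc.details.push({ name, value });
    out.set(key, doc);
  }
  return [...out.values()];
}

const result = regroup(rows);
console.log(JSON.stringify(result));
```

For the two sample documents this gives a single "s1" document with d1 = 400, d2 = 200 and d8 = 400, matching the shape asked for in the question.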

So Bingo, we have your result. Thanks for the cool question to show the use of a double unwind.

2 Comments

Thank you for such a fast answer, Neil! The real problem is that a single document has many "total" fields and many different "details" arrays (7, actually), each containing 5-100 elements. How many unwinds should I use in that case? :) Or should I change the document's schema and move those "details" arrays into separate collections?
@arctica That sounds pretty complex and is probably best represented in another question where you can be specific on the details. The approach is roughly the same, but I or someone else can better explain the construction if there is a question there with enough information so we can answer.
