1

I have some trade data like this

{
    "_id" : 1498290900.0,
    "trade" : {
        "type" : "Modify",
        "data" : {
            "type" : "bid",
            "rate" : "0.00658714",
            "amount" : "3.82354427"
        },
        "date" : 1498290930291.0,
        "name" : "TLX"
    }
},{
    "_id" : 1498290900.0,
    "trade" : {
        "type" : "Modify",
        "data" : {
            "type" : "ask",
            "rate" : "0.00658714",
            "amount" : "3.82354427"
        },
        "date" : 1498290930291.0,
        "name" : "TLX"
    }
},{
    "_id" : 1498290900.0,
    "trade" : {
        "type" : "Remove",
        "data" : {
            "type" : "ask",
            "rate" : "0.00680891"
        },
        "date" : 1498290931349.0,
        "name" : "TLX"
     }
}

These come from $rewind hence the _id being same. Want i want to do next is group them on _id so i try

{ 
    $group: { 
      _id: {_id: "$_id", name: "$trade.type",dtype: "$trade.data.type"},
        count          : {$sum: 1}
    },
  },{$project: { _id: "$_id._id", type: "$_id.name", count: 1, dtype: "$_id.dtype" } },
  {

      $group: {
          _id: "$_id",
          results: { $push : "$$ROOT" }
          }
  }

Which is quite good, give me the below

 {
    "_id" : 1498276800.0,
    "results" : [ 
        {
            "count" : 16.0,
            "_id" : 1498276800.0,
            "type" : "Modify",
            "dtype" : "bid"
        }, 
        {
            "count" : 15.0,
            "_id" : 1498276800.0,
            "type" : "Remove",
            "dtype" : "bid"
        }, 
        {
            "count" : 3.0,
            "_id" : 1498276800.0,
            "type" : "Remove",
            "dtype" : "ask"
        }, 
        {
            "count" : 1.0,
            "_id" : 1498276800.0,
            "type" : "Modify",
            "dtype" : "ask"
        }
    ]
}

But i was trying to make the output more like this

  {
    "_id" : 1498276800.0,
    "Modify": {
      "bid":{
        "count": 16.0
      },
      "ask": {
        "count": 1.0
      }
    },
    "Remove": {
      "bid":{
        "count": 15.0
      },
      "ask": {
        "count": 3.0
      }
    }
  }

But no amount of playing with $projections has got me close.

Can anyone point me in the right direction please?

thanks.

UPDATE

Excluding last pipeline stage, this is example documents with nice bid/ask per type ready to be grouped by _id.

{
    "_id" : {
        "_id" : 1498276800.0,
        "type" : "orderBookRemove"
    },
    "results" : [ 
        {
            "k" : "bid",
            "v" : {
                "count" : 15.0
            }
        }, 
        {
            "k" : "ask",
            "v" : {
                "count" : 3.0
            }
        }
    ]
},
{
    "_id" : {
        "_id" : 1498276800.0,
        "type" : "orderBookModify"
    },
    "results" : [ 
        {
            "k" : "bid",
            "v" : {
                "count" : 16.0
            }
        }, 
        {
            "k" : "ask",
            "v" : {
                "count" : 1.0
            }
        }
    ]
}

When last part of pipeline is applied i.e

{ "$group": {
    "_id": "$_id._id",
    "results": {
      "$push": {
        "k": "$_id.type",
        "v": "$results"
      }
    }
  }}

I get this, only first 'bid' element of results array. 2nd item 'ask' goes AWOL ?

{
    "_id" : 1498280700.0,
    "results" : [ 
        {
            "k" : "orderBookRemove",
            "v" : [ 
                {
                    "k" : "bid",
                    "v" : {
                        "count" : 9.0
                    }
                }
            ]
        }, 
        {
            "k" : "orderBookModify",
            "v" : [ 
                {
                    "k" : "bid",
                    "v" : {
                        "count" : 6.0
                    }
                }
            ]
        }
    ]
}

1 Answer 1

1

This entirely depends on the MongoDB version you have available, or not really, depending on how you look at it. As you say the data is actually from an array originally so I am going to start in that format, and process each option from there.

The source being considered then being:

{
        "_id" : ObjectId("594f3a530320738061df3eea"),
        "data" : [
                {
                        "_id" : 1498290900,
                        "trade" : {
                                "type" : "Modify",
                                "data" : {
                                        "type" : "bid",
                                        "rate" : "0.00658714",
                                        "amount" : "3.82354427"
                                },
                                "date" : 1498290930291,
                                "name" : "TLX"
                        }
                },
                {
                        "_id" : 1498290900,
                        "trade" : {
                                "type" : "Modify",
                                "data" : {
                                        "type" : "ask",
                                        "rate" : "0.00658714",
                                        "amount" : "3.82354427"
                                },
                                "date" : 1498290930291,
                                "name" : "TLX"
                        }
                },
                {
                        "_id" : 1498290900,
                        "trade" : {
                                "type" : "Remove",
                                "data" : {
                                        "type" : "ask",
                                        "rate" : "0.00680891"
                                },
                                "date" : 1498290931349,
                                "name" : "TLX"
                        }
                }
        ]
}

MongoDB 3.4

Simply use $replaceRoot and $arrayToObject with some careful placement of results:

db.dtest.aggregate([
  { "$unwind": "$data" },
  { "$group": { 
   "_id": {
     "_id": "$data._id", 
     "type": "$data.trade.type",
     "dtype": "$data.trade.data.type"
   },
   "count": { "$sum": 1 }
  }},
  { "$group": {
    "_id": {
      "_id": "$_id._id",
      "type": "$_id.type"
    },
    "results": {
      "$push": {
        "k": "$_id.dtype",
        "v": {
          "count": "$count"
        }
      }         
    }
  }},
  { "$group": {
    "_id": "$_id._id",
    "results": {
      "$push": {
        "k": "$_id.type",
        "v": "$results"
      }
    }
  }},
  { "$replaceRoot": {
    "newRoot": {
      "$arrayToObject": {
        "$concatArrays": [
          [{ "k": "_id", "v": "$_id" }],
          { "$map": {
            "input": "$results",
            "as": "r",
            "in": {
              "k": "$$r.k",
              "v": { "$arrayToObject": "$$r.v" }
            }
          }}
        ]
      }
    }
  }}
])

All versions

Which is what probably makes the most sense in most cases where we simply do the aggregated array form and do the transformation in the client instead. We don't really need additional aggregation since that part is already done, so it's not like we are reducing the data any further.

Simple to do in most languages, but as a basic JavaScript concept that works in the shell:

db.dtest.aggregate([
  { "$unwind": "$data" },
  { "$group": { 
   "_id": {
     "_id": "$data._id", 
     "type": "$data.trade.type",
     "dtype": "$data.trade.data.type"
   },
   "count": { "$sum": 1 }
  }},
  { "$group": {
    "_id": {
      "_id": "$_id._id",
      "type": "$_id.type"
    },
    "results": {
      "$push": {
        "k": "$_id.dtype",
        "v": {
          "count": "$count"
        }
      }         
    }
  }},
  { "$group": {
    "_id": "$_id._id",
    "results": {
      "$push": {
        "k": "$_id.type",
        "v": "$results"
      }
    }
  }}
]).map(doc => 
  doc.results.map(r => 
    ({ k: r.k, v: r.v.reduce((acc,curr) =>
      Object.assign(acc, { [curr.k]: curr.v }),{}) 
    })
  ).reduce((acc,curr) => 
    Object.assign(acc, { [curr.k]: curr.v }),{ _id: doc._id })
)

Which is essentially doing the same thing whilst processing the cursor as out new fancy pipeline stage is doing for each document.

So it really just goes to show that unless you intend to aggregate the results further along than from even this result, it simply is not necessary to use the fancy new operators. The same thing is achieved in what is considerably less lines of code and is far less terse to express.

Both output the same thing:

    {
            "_id" : 1498290900,
            "Modify" : {
                    "ask" : {
                            "count" : 1
                    },
                    "bid" : {
                            "count" : 1
                    }
            },
            "Remove" : {
                    "ask" : {
                            "count" : 1
                    }
            }
    }

Debug - to remove

Taking the data from your update I apply this:

db.test.aggregate([
  { "$group": {
    "_id": "$_id._id",
    "results": {
      "$push": {
        "k": "$_id.type",
        "v": "$results"
      }
    }
  }},
  { "$replaceRoot": {
    "newRoot": {
      "$arrayToObject": {
        "$concatArrays": [
          [{ "k": "_id", "v": "$_id" }],
          { "$map": {
            "input": "$results",
            "as": "r",
            "in": {
              "k": "$$r.k",
              "v": { "$arrayToObject": "$$r.v" }
            }
          }}
        ]
      }
    }
  }}
 ])

And get the expected output:

{
    "_id" : 1498276800.0,
    "orderBookRemove" : {
        "bid" : {
            "count" : 15.0
        },
        "ask" : {
            "count" : 3.0
        }
    },
    "orderBookModify" : {
        "bid" : {
            "count" : 16.0
        },
        "ask" : {
            "count" : 1.0
        }
    }
}

So your claimed output is false, and you are not following the example.

Sign up to request clarification or add additional context in comments.

6 Comments

thank you.. i am using 3.4. I tried option 2 but it worked up till last pipeline stage but only carries forward one element of the results array. I have updated question under UPDATE with example of what i see.
@FloWoo You have not taken "all" of the pipeline in the answer. You are missing a stage with the $replaceRoot which is the last stage, or the alternate code approach processing the cursor result. It does work of course because as I clearly put, I'm running of your own data in the question, and I show the output which is the same as what you asked for. Add in the last stage and run again.
@FloWoo Please follow the self enclosed example within the answer. Start with the document I provide and run the full pipeline. You are doing something wrong in your interpretation. Follow the example all the way through first. What you are claiming in your updated output is not actually possible, without you intervening and doing something different to what is listed here. The answer deliberately replaces your existing pipeline from the point where the data was in the array because your own handling from that point was incorrect. I am correcting everything from that source.
@FloWoo Read my debug statement at the end. I'm taking the data you posted and running the same remaining pipeline stages I listed. The output is perfect, just as expected.
thank you sir, you are correct.. the 'problem' was my fault and your answer is brilliant. Thank you.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.