367

I am playing around with MongoDB, trying to figure out how to do the equivalent of this simple SQL query:

SELECT province, COUNT(*) FROM contest GROUP BY province

But I can't seem to figure it out using the aggregate function. I can do it using some really weird group syntax:

db.user.group({
    "key": {
        "province": true
    },
    "initial": {
        "count": 0
    },
    "reduce": function(obj, prev) {
        if (true != null) if (true instanceof Array) prev.count += true.length;
        else prev.count++;
    }
});

But is there an easier/faster way using the aggregate function?

9 Answers

582

This would be the easier way to do it using aggregate:

db.contest.aggregate([
    {"$group" : {_id:"$province", count:{$sum:1}}}
])
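To see what that $group stage computes, here is a minimal sketch of the same counting logic in plain JavaScript on an in-memory array. The sample documents and the groupAndCount helper are made up for illustration; this does not talk to MongoDB.

```javascript
// Hypothetical sample documents standing in for the contest collection.
const contest = [
  { province: "ON" },
  { province: "QC" },
  { province: "ON" },
];

// Same idea as { $group: { _id: "$province", count: { $sum: 1 } } }:
// one output document per distinct value, with a running count.
// groupAndCount is an illustrative helper, not a MongoDB API.
function groupAndCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[field];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([_id, count]) => ({ _id, count }));
}

const provinceCounts = groupAndCount(contest, "province");
console.log(provinceCounts);
```

Each output object has the same shape as the documents returned by the aggregate call: the group key in _id and the number of matching documents in count.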

6 Comments

I get an error message when I try that: "errmsg" : "exception: A pipeline stage specification object must contain exactly one field." Any idea why?
How do you add a sort to it? I want to sort by count in descending order (-1).
@FilipBartuzi there is an example in the documentation page, you'll have to add a sort operation to the pipeline, as { $sort: { count: -1 } }
I got the same exception as @Steven and it was because I copy-pasted just line 2 and omitted the surrounding square brackets.
@Steven you have to execute it properly; use something like this: .aggregate(aggregate).exec();
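Putting the comments above together, a sketch of the full pipeline might look like the following. The variable name pipeline is just for illustration; the important points are that the stages sit inside one array and that each stage object has exactly one key.

```javascript
// The pipeline is an array of stages; each stage is an object
// with exactly one top-level key ($group, $sort, ...).
const pipeline = [
  // Count documents per province.
  { $group: { _id: "$province", count: { $sum: 1 } } },
  // Sort the groups by count, largest first.
  { $sort: { count: -1 } },
];

// Sanity check: list the stage operator used by each stage.
const stageNames = pipeline.map((stage) => Object.keys(stage));
console.log(stageNames);

// In the mongo shell this would be run as:
//   db.contest.aggregate(pipeline)
```

Dropping the square brackets, or merging two operators into one stage object, is the kind of mistake that produces the "exactly one field" error quoted above.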
173

I needed to do some extra operations based on the result of the aggregate function, and eventually found a few solutions. I have a collection Request with the fields request, source, status, and requestDate.

Single Field Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:"$source", count:{$sum:1}}}
])

Multiple Fields Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}}
])

Multiple Fields Group By & Count with Sort using Field:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"_id.source":1}}
])

Multiple Fields Group By & Count with Sort using Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"count":-1}}
])

2 Comments

Can you explain this part: _id:{source:"$source",status:"$status"}?
Basically, in a $group stage the "_id" field is the group key, and it is unique for each output document. The field accepts an expression, so you can build its value from multiple fields based on your grouping criteria. You will find more details about the field in the link: docs.mongodb.com/manual/reference/operator/aggregation/group/…
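As a rough illustration of how a compound _id behaves, the sketch below groups an in-memory array by two fields in plain JavaScript. The sample data and the groupByFields helper are hypothetical; MongoDB does the equivalent work server-side.

```javascript
// Hypothetical sample documents standing in for the Request collection.
const requests = [
  { source: "web", status: "open" },
  { source: "web", status: "closed" },
  { source: "web", status: "open" },
  { source: "api", status: "open" },
];

// Mimics { $group: { _id: { source: "$source", status: "$status" },
//                    count: { $sum: 1 } } }.
// groupByFields is an illustrative helper, not a MongoDB API.
function groupByFields(docs, fields) {
  const groups = new Map();
  for (const doc of docs) {
    // Build the compound key object from the chosen fields.
    const id = {};
    for (const field of fields) id[field] = doc[field];
    // Serialize it so that equal combinations land in the same group.
    const key = JSON.stringify(id);
    const group = groups.get(key) || { _id: id, count: 0 };
    group.count += 1;
    groups.set(key, group);
  }
  return Array.from(groups.values());
}

const requestCounts = groupByFields(requests, ["source", "status"]);
console.log(requestCounts);
```

Every distinct (source, status) combination becomes one output document, which is why the sort in the answer refers to "_id.source".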
76

If you need multiple columns to group by, follow this model. Here I am conducting a count by status and type:

db.BusinessProcess.aggregate([
    // Wrap the stage in an array so it is a proper pipeline
    {
        "$group": {
            _id: {
                status: "$status",
                type: "$type"
            },
            count: {
                $sum: 1
            }
        }
    }
])

2 Comments

Does _id represent a default parameter for encapsulating multiple fields?
@RoyiNamir, please take a look at the link. You may find your information there. docs.mongodb.com/manual/reference/operator/aggregation/group/…
59

Starting in MongoDB 3.4, you can use the $sortByCount aggregation stage.

Groups incoming documents based on the value of a specified expression, then computes the count of documents in each distinct group.

https://docs.mongodb.com/manual/reference/operator/aggregation/sortByCount/

For example:

db.contest.aggregate([
    { $sortByCount: "$province" }
]);

1 Comment

Probably worth noting here that $sortByCount is actually a "pseudo operator" like several more aggregation stage operators introduced from MongoDB 3.4. All they really do is expand into their respective aggregation stages. In this case a $group with $sum: 1 as shown in existing answers and an additional $sort stage. They offer no advantage other than "typing less code", which may or may not be more descriptive ( if you're into that sort of thing ). IMHO, distinct $group and $sort stages in the code are far more descriptive and indeed more flexible.
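The expansion described in that comment can be sketched as follows. expandSortByCount is a hypothetical helper written only to show the two equivalent pipeline shapes; it is not part of MongoDB or its shell.

```javascript
// Illustrative helper (not a MongoDB API): returns the $group + $sort
// stages that a { $sortByCount: <expression> } stage stands for.
function expandSortByCount(expression) {
  return [
    { $group: { _id: expression, count: { $sum: 1 } } },
    { $sort: { count: -1 } },
  ];
}

// Short form, as in the answer above.
const shortForm = [{ $sortByCount: "$province" }];

// Long form with explicit stages.
const longForm = expandSortByCount("$province");
console.log(JSON.stringify(longForm));

// Either array could be passed to db.contest.aggregate(...) in the shell.
```

The long form is more typing, but it lets you change the accumulator or the sort order, which the short form does not.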
29

Additionally, if you need to restrict which documents are grouped, add a $match stage before the $group:

db.events.aggregate([
    {$match: {province: "ON"}},
    {$group: {_id: "$date", number: {$sum: 1}}}
])
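Stage order matters here: the filter runs first, so only matching documents are counted. A minimal in-memory sketch of that behaviour, with made-up sample data and plain JavaScript instead of a MongoDB call:

```javascript
// Hypothetical sample documents standing in for the events collection.
const events = [
  { province: "ON", date: "2020-01-01" },
  { province: "ON", date: "2020-01-01" },
  { province: "QC", date: "2020-01-01" },
  { province: "ON", date: "2020-01-02" },
];

// Step 1, like { $match: { province: "ON" } }: keep only Ontario events.
const matched = events.filter((event) => event.province === "ON");

// Step 2, like { $group: { _id: "$date", number: { $sum: 1 } } }:
// count the remaining events per date.
const perDate = {};
for (const event of matched) {
  perDate[event.date] = (perDate[event.date] || 0) + 1;
}

console.log(perDate);
```

The QC event on 2020-01-01 never reaches the counting step, so that date ends up with a count of 2 rather than 3.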

12

This type of query worked for me:

db.events.aggregate([{$group: {_id: "$date", number: {$sum: 1}}}])

See http://docs.mongodb.org/manual/tutorial/aggregation-with-user-preference-data/

8

Starting in Mongo 5.0, we can also use { $count: { } } as an alias for { $sum : 1 }:

// { "province" : "Champagne-Ardenne" }
// { "province" : "Champagne-Ardenne" }
// { "province" : "Haute-Normandie"   }
db.collection.aggregate([
  { $group: { _id: "$province", count: { $count: {} } } }
])
// { "_id" : "Champagne-Ardenne", "count" : 2 }
// { "_id" : "Haute-Normandie",   "count" : 1 }

6

db.contest.aggregate([
    { $match: { /* optional match criteria */ } },
    { $project: { "province": 1, _id: 0 } },
    { $sortByCount: "$province" }
], { allowDiskUse: true });

MongoDB limits how much memory a sort may use (32 MB for in-memory sorts in older versions), and each aggregation pipeline stage is limited to 100 MB of RAM. When you run this query against millions of documents, pass the allowDiskUse: true option so that large stages can spill to temporary files on disk instead of failing in memory. Use $project to reduce the data flowing into the next stage. If you are working with a small data set, there is no need for the allowDiskUse option.

-1

Mongo shell command that worked for me:

db.getCollection(<collection_name>).aggregate([{"$match": {'<key>': '<value to match>'}}, {"$group": {'_id': {'<group_by_attribute>': "$group_by_attribute"}}}])
