13

I have a collection called "Lines" with the following structure (basically, I have a lot of documents that contain several arrays and I need to count their elements with conditions).

    {
        "_id" : "201503110040020021",
        "Line" : "1", // several documents may have this Line value
        "LineStart" : ISODate("2015-03-11T06:49:35.000Z"),
        "SSCEXPEND" : [ 
            {
                "Secuence" : 10,
                "Title" : 1,
            }, 
            {
                "Secuence" : 183,
                "Title" : 613,
            }, 
            ...
        ],
        "SSCCANCELATIONS" : [ 
            {
                "Secuence" : 34,
                "Title" : 113,
            }, 
            {
                "Secuence" : 96,
                "Title" : 2,
            }, 
            ... 
        ],
        "SSCVALIDATIONS" : [ 
            {
                "Secuence" : 12,
                "Result" : 1
            }, 
            {
                "Secuence" : 15,
                "Result" : 1,
            },
            {
                "Secuence" : 18,
                "Result" : 20,
            },
            ...
        ]
    },
    ...

What I need is to count how many elements in those arrays match certain conditions. For instance, I want to count every element in SSCCANCELATIONS, but only the SSCEXPEND elements with Title = 1 and only the SSCVALIDATIONS elements with Result < 10.

I can get the total number of elements of every array, with

db.Lines.aggregate( { $project: { Line : 1, Validations: { $size: "$SSCVALIDATIONS" }, ... } } ) 

But I need to establish conditions, to get something like:

    {
        "_id" : "201503110040020021",
        "Line" : "1",
        "LineStart" : ISODate("2015-03-11T06:49:35.000Z"),
        "SSCEXPEND" : 15,
        "SSCCANCELATIONS" : 10,
        "SSCVALIDATIONS" : 462
    },

In the end, I will need to group the results by Line and LineStart, but I think I already have everything else (I get the date by subtracting hours, minutes, ... from the dates I have).

So the only thing I need to know is how to count only the array elements I really want.

I have read about db.collection.group(), but that method does not work with sharded clusters, so I can't use it.

I have also read this old question, "MongoDB: Count of matching nested array elements", which is more or less the same. It was answered almost five years ago, though, and at the time the answer was that there was no direct way to do it, so I am asking in case there is a way now.


2 Answers

13

The $unwind approach uses a lot of resources. When you can, do the counting directly in a $project stage, combining $filter and $size ($filter requires MongoDB 3.2 or later):

db.Lines.aggregate([
{
    $project: {
         _id: 1,
         Line: 1,
         LineStart:1,

         SSCEXPEND: {
            $size: {
                $filter: {
                   input: "$SSCEXPEND",
                   as: "e",
                   cond:{ $eq: [ "$$e.Title", 1 ]}
                }
            }
         },
         SSCCANCELATIONS: {
            $size: "$SSCCANCELATIONS"
         },
         SSCVALIDATIONS:{
            $size: {
               $filter: {
                   input: "$SSCVALIDATIONS",
                   as: "v",
                   cond: {$lt: [ "$$v.Result", 10 ]}
                }
            }
         }

      }
}
])
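
The pipeline above assumes every document contains all three arrays; $size raises an error when its argument is not an array, for example when the field is missing. If that can happen in your data, one possible guard is $ifNull with an empty array as the fallback. This is only a sketch: `countWhere` is a hypothetical helper written for this example, not part of MongoDB, and the stage is built as a plain object so it can be inspected before being passed to the shell.

```javascript
// Hypothetical helper (not a MongoDB API): builds a "count matching elements"
// expression that treats a missing or null array as empty.
function countWhere(arrayField, cond) {
  const input = { $ifNull: ["$" + arrayField, []] };
  // Without a condition, count every element; otherwise filter first.
  return { $size: cond ? { $filter: { input: input, as: "el", cond: cond } } : input };
}

const projectStage = {
  $project: {
    Line: 1,
    LineStart: 1,
    SSCEXPEND: countWhere("SSCEXPEND", { $eq: ["$$el.Title", 1] }),
    SSCCANCELATIONS: countWhere("SSCCANCELATIONS"),
    SSCVALIDATIONS: countWhere("SSCVALIDATIONS", { $lt: ["$$el.Result", 10] })
  }
};

// In the mongo shell: db.Lines.aggregate([ projectStage ])
```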

Then simply add your $group stage after it to sum SSCEXPEND, SSCCANCELATIONS, etc.
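
A minimal sketch of such a $group stage, assuming you group on the raw Line and LineStart values (the question mentions truncating the date first; that step is left out here). The variable name `groupStage` is only for illustration.

```javascript
// Sums the per-document counts produced by the $project stage above.
// Assumption: grouping on the unmodified Line and LineStart fields.
const groupStage = {
  $group: {
    _id: { Line: "$Line", LineStart: "$LineStart" },
    SSCEXPEND: { $sum: "$SSCEXPEND" },
    SSCCANCELATIONS: { $sum: "$SSCCANCELATIONS" },
    SSCVALIDATIONS: { $sum: "$SSCVALIDATIONS" }
  }
};

// In the mongo shell, append it after the $project stage:
//   db.Lines.aggregate([ { $project: { /* counts as above */ } }, groupStage ])
```

Because $project has already reduced each array to a number, the $group stage only has to add integers, which keeps the intermediate documents small.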


5

You can get the counts with the MongoDB aggregation framework; see the aggregation query below:

db.Lines.aggregate([
{
    "$unwind": "$SSCEXPEND"
},
{
    "$unwind": "$SSCVALIDATIONS"
},
{
    "$match": {
        "$and": [
            {
                "SSCEXPEND.Title": 1
            },
            {
                "SSCVALIDATIONS.Result": {
                    "$lt": 10
                }
            }
        ]
    }
},
{
    "$group": {
        "_id": "$_id",
        "SSCEXPEND": {
            "$addToSet": "$SSCEXPEND"
        },
        "SSCVALIDATIONS": {
            "$addToSet": "$SSCVALIDATIONS"
        },
        "SSCCANCELATIONS": {
            "$first": "$SSCCANCELATIONS"
        }
    }
},
{
    "$project": {
        "SSCEXPENDCOUNT": {
            "$size": "$SSCEXPEND"
        },
        "SSCVALIDATIONSCOUNT": {
            "$size": "$SSCVALIDATIONS"
        },
        "SSCCANCELATIONSCOUNT": {
            "$size": "$SSCCANCELATIONS"
        }
    }
}
]).pretty()
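
To see why the two $unwind stages get expensive: each $unwind emits one document per array element, so unwinding two arrays of sizes m and n yields m * n intermediate documents for every input document. The following is a plain-JavaScript imitation for illustration only; the `unwind` function and the sample document are made up for this example and are not MongoDB code.

```javascript
// Imitates $unwind on in-memory objects: one output object per array element.
// Like $unwind by default, it drops documents whose array is missing or empty.
function unwind(docs, field) {
  return docs.flatMap(function (doc) {
    return (doc[field] || []).map(function (el) {
      return Object.assign({}, doc, { [field]: el });
    });
  });
}

// Hypothetical sample document with 3 and 4 array elements.
const sample = {
  _id: "demo",
  SSCEXPEND: [{ Secuence: 10, Title: 1 }, { Secuence: 11, Title: 1 }, { Secuence: 183, Title: 613 }],
  SSCVALIDATIONS: [
    { Secuence: 12, Result: 1 },
    { Secuence: 15, Result: 1 },
    { Secuence: 18, Result: 20 },
    { Secuence: 21, Result: 5 }
  ]
};

const rows = unwind(unwind([sample], "SSCEXPEND"), "SSCVALIDATIONS");
console.log(rows.length); // 3 * 4 = 12 intermediate documents
```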

3 Comments

It gets quite slow when I add further aggregations and projections (due to my data), but it does exactly what I needed. Thank you very much.
@IvanBernardo If you add indexes and sharding, it will probably run much faster. The problem is that using $unwind on several arrays in one aggregation produces a Cartesian product, which is why it takes so long. Hopefully MongoDB will improve this in a coming version.
Yes, sharding design will be one of the next steps we'll have to get into. Thanks again.
