
I'm trying to store the total number of records entering the pipeline so I can use it in a later calculation. How do I capture the input count, then $unwind, then use that count later in my calculations?

I can get the number by doing this:

db.articles.aggregate([
  {
    $count: "totalArticles"
  }
])

I can get the rest of the data I want by doing this:

db.articles.aggregate([
  {
    $unwind: "$concepts"
  },
  {
    $group: {
      _id:    "$concepts.text",
      count: {
        $sum: 1
      },
      average: {
        $avg: "$concepts.relevance"
      },
    }
  }
])

What I'd really like to do is this:

db.articles.aggregate([
  {
    $count: "totalArticles"
  },
  {
    $unwind: "$concepts"
  },
  {
    $group: {
      _id:    "$concepts.text",
      count: {
        $sum: 1
      },
      average: {
        $avg: "$concepts.relevance"
      }
    }
  },
  {
    $project: {
      count: "$count",
      percent: {
        $divide: [ "$count", "$totalArticles" ]
      }
    }
  },
  {
    $sort: {
      count: -1
    }
  }
])

1 Answer


You can use the aggregation query below.

The initial $group calculates the total count while using $push to collect each document's concepts field into an array. $$ROOT refers to the whole document.

The next $group retains the total article count.

Everything else stays the way you have it.

db.articles.aggregate([
  {"$group":{
    "_id":null,
    "totalArticles":{"$sum":1},
    "concepts":{"$push":"$$ROOT.concepts"}
  }},
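  // If concepts is an array in each article, $push above collects an array of arrays,
  // so unwind twice: first into per-article arrays, then into individual concepts.
  {"$unwind":"$concepts"},
  {"$unwind":"$concepts"},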
  {"$group":{
    "_id":"$concepts.text",
    "totalArticles":{"$first":"$totalArticles"},
    "count":{"$sum":1},
    "average":{"$avg":"$concepts.relevance"}
  }},
  {"$project":{
      "count": "$count",
      "percent": {
        "$divide": [ "$count", "$totalArticles" ]
      }
    }
  },
  {"$sort": {"count": -1}}
])
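
As a sanity check on the numbers the pipeline is after, here is a plain JavaScript sketch of the same calculation done in memory. It assumes each article holds a `concepts` array of `{ text, relevance }` objects, as the question's $unwind suggests; the sample data and the `conceptStats` helper are made up for illustration and are not part of MongoDB.

```javascript
// Hypothetical sample data: two articles, each with a concepts array.
const articles = [
  { concepts: [{ text: "mongodb", relevance: 0.75 }, { text: "aggregation", relevance: 0.5 }] },
  { concepts: [{ text: "mongodb", relevance: 0.25 }] },
];

// In-memory equivalent of: count the articles, unwind concepts, group by text,
// then divide each group's count by the total number of articles.
function conceptStats(docs) {
  const totalArticles = docs.length; // what the first $group keeps as totalArticles
  const counts = new Map();
  for (const doc of docs) {
    for (const concept of doc.concepts || []) { // $unwind
      counts.set(concept.text, (counts.get(concept.text) || 0) + 1); // $group + $sum
    }
  }
  return [...counts.entries()]
    .map(([text, count]) => ({ _id: text, count, percent: count / totalArticles })) // $project
    .sort((a, b) => b.count - a.count); // $sort: { count: -1 }
}

console.log(conceptStats(articles));
```

With this sample, "mongodb" appears twice across two articles and "aggregation" once, so the percent values come out as 1 and 0.5.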

$facet is also an option: you can run the two queries in two separate sub-pipelines, then merge the results and continue with the remaining stages.
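
A minimal sketch of the $facet variant, assuming the same `articles` collection and field names as in the question. The facet names `total` and `concepts` and the variable name `facetPipeline` are placeholders chosen here. Note that $facet returns everything in a single document, so the combined result has to fit within the 16 MB BSON document limit.

```javascript
// Build the pipeline as a plain array; in the shell, run it with
// db.articles.aggregate(facetPipeline).
const facetPipeline = [
  {
    $facet: {
      // Sub-pipeline 1: count the input documents.
      total: [{ $count: "totalArticles" }],
      // Sub-pipeline 2: per-concept statistics, as in the question.
      concepts: [
        { $unwind: "$concepts" },
        {
          $group: {
            _id: "$concepts.text",
            count: { $sum: 1 },
            average: { $avg: "$concepts.relevance" }
          }
        }
      ]
    }
  },
  // Both facets are arrays: total has one element, concepts has one per group.
  { $unwind: "$total" },
  { $unwind: "$concepts" },
  {
    $project: {
      _id: "$concepts._id",
      count: "$concepts.count",
      percent: { $divide: ["$concepts.count", "$total.totalArticles"] }
    }
  },
  { $sort: { count: -1 } }
];
```

The two $unwind stages after $facet are what "merge" the facets: each output document then carries one concept group alongside the total.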


1 Comment

That works well, thank you! I've read about $$ROOT and understand there is a total memory size limit. Is there a way to reduce memory use by selecting fields at that stage with $project? As you can see, I'm only using the concepts field in the rest of the calculation.
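
On the $project idea, one possible sketch is below. It assumes `concepts` is an array in each article, and the variable name `trimmedPipeline` is a placeholder. Since `"$$ROOT.concepts"` resolves to the same value as `"$concepts"`, only that field is pushed either way; the leading $project just makes the trimming explicit. The accumulated group document is still subject to the 16 MB BSON document limit, and passing `{ allowDiskUse: true }` to `aggregate` may help if the $group stage runs into its memory limit.

```javascript
// In the shell: db.articles.aggregate(trimmedPipeline, { allowDiskUse: true })
const trimmedPipeline = [
  // Keep only the field used downstream.
  { $project: { _id: 0, concepts: 1 } },
  {
    $group: {
      _id: null,
      totalArticles: { $sum: 1 },
      concepts: { $push: "$concepts" } // one concepts array per article
    }
  },
  { $unwind: "$concepts" }, // back to one array per article
  { $unwind: "$concepts" }, // then one document per concept
  {
    $group: {
      _id: "$concepts.text",
      totalArticles: { $first: "$totalArticles" },
      count: { $sum: 1 },
      average: { $avg: "$concepts.relevance" }
    }
  },
  {
    $project: {
      count: 1,
      percent: { $divide: ["$count", "$totalArticles"] }
    }
  },
  { $sort: { count: -1 } }
];
```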
