This is a common misunderstanding of the query process. I suspect one of your docs looks like this:
{
"_id" : 0,
"replies" : [
{ "_id" : "R0", status: "rejected"}
,{ "_id" : "R1", status: "approved"}
]
}
The issue is that a find on an array field matches any doc where AT LEAST one of the array entries matches, and it returns the whole doc; it does NOT filter the array down to JUST those entries. Here are two approaches. Given this data setup:
var r =
[
{
"_id" : 0,
"replies" : [
{ "_id" : "R0", status: "rejected"}
,{ "_id" : "R1", status: "approved"}
]
}
,{
"_id" : 1,
"replies" : [
{ "_id" : "R2", status: "rejected"}
,{ "_id" : "R3", status: "rejected"}
]
}
,{
"_id" : 2,
"replies" : [
{ "_id" : "R4", status: "rejected"}
,{ "_id" : "R5", status: "approved"}
]
}
];
// Load the sample docs into the collection that the pipelines below query:
db.foo.insertMany(r);
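To see the "AT LEAST one entry" behavior without a database, here is a rough plain-JavaScript stand-in for what the match does. It is only a sketch of the semantics, not how MongoDB implements it; `docs` and `matchAny` are hypothetical names made up for this illustration.

```javascript
// Plain-JavaScript stand-in for the query semantics; no MongoDB needed.
// matchAny is a hypothetical helper name, not a MongoDB API.
const docs = [
  { _id: 0, replies: [{ _id: "R0", status: "rejected" }, { _id: "R1", status: "approved" }] },
  { _id: 1, replies: [{ _id: "R2", status: "rejected" }, { _id: "R3", status: "rejected" }] },
  { _id: 2, replies: [{ _id: "R4", status: "rejected" }, { _id: "R5", status: "approved" }] },
];

// Roughly what {"replies.status": "approved"} does: keep the WHOLE doc
// if at least one array entry has the requested status.
function matchAny(input, status) {
  return input.filter((d) => d.replies.some((r) => r.status === status));
}

const matched = matchAny(docs, "approved");
console.log(matched.map((d) => d._id));            // [ 0, 2 ]
console.log(matched[0].replies.map((r) => r._id)); // [ 'R0', 'R1' ]
```

Note that doc 0 comes back with the rejected R0 entry still in it, which is exactly the surprise in the question.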
Approach 1: Simple, and a good fit when the embedded array is small (dozens, not 100s or 1000s, of entries).
db.foo.aggregate([
// This will find all docs where the replies array has AT LEAST ONE
// "approved" entry. It will NOT filter the array down to just those entries.
{$match: {"replies.status": "approved"}}
// Now that we have them, unwind and "refilter"
,{$unwind: "$replies"}
,{$match: {"replies.status": "approved"}}
]);
{ "_id" : 0, "replies" : { "_id" : "R1", "status" : "approved" } }
{ "_id" : 2, "replies" : { "_id" : "R5", "status" : "approved" } }
Approach 2: Use $filter if the array is very large and $unwind would create 1000s of docs. This approach is also useful for preserving the structure of the original doc:
db.foo.aggregate([
// This will find all docs where the replies array has AT LEAST ONE
// "approved" entry. It will NOT filter the array down to just those entries.
{$match: {"replies.status": "approved"}}
// To avoid $unwind, directly filter just "approved" and reset the replies
// field back into the parent doc:
,{$addFields: {replies: {$filter: {
input: "$replies",
as: "zz",
cond: { $eq: [ "$$zz.status", "approved" ] }
}}
}}
]);
/*
{
"_id" : 0,
"replies" : [
{
"_id" : "R1",
"status" : "approved"
}
]
}
{
"_id" : 2,
"replies" : [
{
"_id" : "R5",
"status" : "approved"
}
]
}
*/
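If it helps to compare the two result shapes side by side, here is a plain-JavaScript analogue of both pipelines. This is a sketch of the behavior only, assuming the same sample data as above; `sample`, `unwindAndMatch`, and `filterInPlace` are hypothetical names, not MongoDB functions.

```javascript
// Plain-JavaScript analogue of the two pipelines; helper names are hypothetical.
const sample = [
  { _id: 0, replies: [{ _id: "R0", status: "rejected" }, { _id: "R1", status: "approved" }] },
  { _id: 1, replies: [{ _id: "R2", status: "rejected" }, { _id: "R3", status: "rejected" }] },
  { _id: 2, replies: [{ _id: "R4", status: "rejected" }, { _id: "R5", status: "approved" }] },
];

// Analogue of $unwind + $match: one output doc per matching array entry,
// and "replies" becomes a single object instead of an array.
function unwindAndMatch(input, status) {
  return input.flatMap((d) =>
    d.replies
      .filter((r) => r.status === status)
      .map((r) => ({ _id: d._id, replies: r }))
  );
}

// Analogue of $match + $addFields/$filter: one output doc per parent,
// and "replies" stays an array that holds only the matching entries.
function filterInPlace(input, status) {
  return input
    .filter((d) => d.replies.some((r) => r.status === status))
    .map((d) => ({ ...d, replies: d.replies.filter((r) => r.status === status) }));
}

const unwound = unwindAndMatch(sample, "approved");
const filtered = filterInPlace(sample, "approved");
console.log(Array.isArray(unwound[0].replies));  // false
console.log(Array.isArray(filtered[0].replies)); // true
```

With one approved reply per doc both approaches return two docs; the difference shows up when a doc has many matching entries, where the unwind style emits one doc per entry and the filter style still emits one doc per parent.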