1

Description

After reworking my data construct, I want to embed a document (or parts of it) from a different collection. Pay your attention to event.tags[]. Currently, there are only their stringified _ids (as plain string, no ObjectId!) stored, but I want to migrate their content, like {tags.name, tags._id}. I want to update/replace the original event.tags with the content of tag-documents.

Data

Collection "events"

db.events.insertMany([{
    "_id" : ObjectId("5f6c5b0594911c0b7231643d"),
    "title" : "Test Event",
    "tags" : [ 
        "5fd8cef2368c625a9d7516cb"
    ]
},
{
    "_id" : ObjectId("5fb3896afc13ae2ed1000000"),
    "title" : "Test Event 2",
    "tags" : [ 
        "5fd8cef2368c625a9d7516cb", 
        "5fd8cfb04a4a6063ab4ca4bf", 
        "5fd8cfb04a4a6063ab4ca4c0"
    ]
}
]);

Collection "tags"

db.tags.insertMany([{
    "_id" : ObjectId("5fd8cef2368c625a9d7516cb"),
    "name" : "Foo",
    "createdAt" : ISODate("2020-12-15T14:57:54.096Z")
},
{
    "_id" : ObjectId("5fd8cfb04a4a6063ab4ca4bf"),
    "name" : "Bar",
    "createdBy" : "[email protected]",
    "createdAt" : ISODate("2020-12-16T14:57:54.096Z")
},
{
    "_id" : ObjectId("5fd8cfb04a4a6063ab4ca4c0"),
    "name" : "Foobar",
    "createdAt" : ISODate("2020-12-17T14:57:54.096Z")
}
]);

Expected result

I want to achieve to embed some/all fields of collection "tags" matched by "events.tags" <-> "tags._id" as an array of embedded/nested documents. This will be part of a migration script, running over mongoDB API >4.0 thru node.js.

Goal: I want to replace an Array of Strings (see collection events) with certain information of a DIFFERENT collection (see collection tags) and save it back to collection events (migrate process). So I assume, I have to fetch/lookup something from tag collection and need a mechanism to replace the string (events.tags[x]) with complex objects ({tags._id, tags.name})

{
    "_id" : ObjectId("5f6c5b0594911c0b7231643d"),
    "title" : "Test Event",
    "tags" : [{
        "_id" : ObjectId("5fd8cef2368c625a9d7516cb"),
        "name" : "Foo",
        "createdAt" : ISODate("2020-12-15T14:57:54.096Z")
    }]
},
{
    "_id" : ObjectId("5fb3896afc13ae2ed1000000"),
    "title" : "Test Event 2",
    "tags" : [{
        "_id" : ObjectId("5fd8cef2368c625a9d7516cb"),
        "name" : "Foo"
    },
    {
        "_id" : ObjectId("5fd8cfb04a4a6063ab4ca4bf"),
        "name" : "Bar"
    },
    {
        "_id" : ObjectId("5fd8cfb04a4a6063ab4ca4c0"),
        "name" : "Foobar"
    }]
}

Baby steps

I tried to use the aggregation pipeline and started by matching and lookup their content to replaceWith in the last step. I am not too familiar to solve this via the aggregation pipeline.

db.getCollection('events').aggregate([
    { $match: { tags: { "$exists" : true } } },
    {
       $lookup:
         {
           from: "tags",
           localField: "tags",
           foreignField: "_id",
           as: "taags"
         }
    },
    ])

1 Answer 1

1

You can convert tag's id from string to objectId using $toObjectId operator, before $lookup stage,

  • $addFields to add new fields and update/format existing fields
  • $map to iterate loop of tags array
  • $toObjectId to convert string objectId to objectId type
  • put the name tags in as property of $lookup stage
  // $match stage here
  {
    $addFields: {
      tags: {
        $map: {
          input: "$tags",
          in: { $toObjectId: "$$this" }
        }
      }
    }
  },
  {
    $lookup: {
      from: "tags",
      localField: "tags",
      foreignField: "_id",
      as: "tags"
    }
  }

Playground

  • With the aggregation query you can not update current events collection, but you can use $out stage to result in new collection, supported from MongoDB 4.4,
  • put this stage at last stage
  { $out: "updated_events" }

The second option is to prepare a script to do the manual process,

  1. Find all records from the events collection
  2. loop the results of events collection
  3. inside loop execute find query by passing tags ids in tags collection
  4. inside loop update events collection by replacing result from 3) step
Sign up to request clarification or add additional context in comments.

11 Comments

Thanks for your fast response. Are there any insights about the replaceAll pipeline usage?
the definition in docs "Replaces all instances of a search string in an input string with a replacement string." is enough to understand $replaceAll operator.
I appreciate your effort and guidance. Nevertheless, I struggle with the aspect to replace an array of IDs with complex, nested objects - no plain string replacements like renaming in lower/upper cases. I replaced $replaceAll with $replaceWith because I linked the wrong one. So I am still in doubt if this will suit my needs. Do you agree?
do you want to update the documents permanently in collection? if yes then aggregation only format the result means it will fetch documents only can't update in collection.
was a mistake, you can not update any collection by aggregation, but you can use $out stage to result in new collection, after that you can remove old collection and rename the new collection to events
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.