MongoDB aggregation pipeline: project several values from the same id

I'm struggling with a MongoDB pipeline. I'm working on a MERN stack application that processes data.

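We ask people questions through forms, and each submission is treated as a session, for example one per person. Each session is recorded in a collection: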

{ _id: 1, created_at:"01/01/2021"}
{ _id: 2, created_at:"02/01/2021"}
{ _id: 3, created_at:"03/01/2021"}

All of their answers are stored in another collection, with sessionId acting as a foreign key:

{ _id: 1, value:"Name1", sessionId : 1, typeofField :"name"}
{ _id: 2, value:"Firstname1", sessionId : 1, typeofField :"firstname"}
{ _id: 3, value:"Date of birth1", sessionId : 1, typeofField :"birthdate"}
{ _id: 4, value:"Name2", sessionId : 2, typeofField :"name"}
{ _id: 5, value:"Firstname2", sessionId : 2, typeofField :"firstname"}
{ _id: 6, value:"Date of birth2", sessionId : 2, typeofField :"birthdate"}

How can I project this data so that each session contains all of its information, like this:

{id :1, created_at:"01/01/2021", name : "Name1", firstname: "Firstname1", birthdate : "Date of birth1"}
{id :2, created_at:"02/01/2021", name : "Name2", firstname: "Firstname2", birthdate : "Date of birth2"}

Here is my solution:

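  1. Find all the answers of a session ($lookup stage)
  2. Convert each answer to an object like: { [typeofField]: value }
  3. Merge all the answers into one object
  4. Finally, merge the newly converted answers object with the root document ($$ROOT, the document from the sessions collection)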
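
For intuition, the steps above can be mimicked in plain JavaScript on in-memory arrays. This is only a sketch of the data flow, not MongoDB code: the helper name `flattenSessions` is made up for illustration, the sample data is copied from the question, and object spread stands in for `$mergeObjects`.

```javascript
// Sample data copied from the question.
const sessions = [
  { _id: 1, created_at: "01/01/2021" },
  { _id: 2, created_at: "02/01/2021" },
  { _id: 3, created_at: "03/01/2021" },
];
const answers = [
  { _id: 1, value: "Name1", sessionId: 1, typeofField: "name" },
  { _id: 2, value: "Firstname1", sessionId: 1, typeofField: "firstname" },
  { _id: 3, value: "Date of birth1", sessionId: 1, typeofField: "birthdate" },
  { _id: 4, value: "Name2", sessionId: 2, typeofField: "name" },
  { _id: 5, value: "Firstname2", sessionId: 2, typeofField: "firstname" },
  { _id: 6, value: "Date of birth2", sessionId: 2, typeofField: "birthdate" },
];

// Hypothetical helper that mirrors the pipeline stage by stage.
function flattenSessions(sessionList, answerList) {
  return sessionList.map((session) => {
    // Find all the answers of this session (the role of $lookup).
    const own = answerList.filter((a) => a.sessionId === session._id);
    // Turn each answer into { [typeofField]: value }.
    const asObjects = own.map((a) => ({ [a.typeofField]: a.value }));
    // Merge all the answer objects into one.
    const merged = Object.assign({}, ...asObjects);
    // Merge with the session document; the session's fields win on a clash.
    return { ...merged, ...session };
  });
}

const flat = flattenSessions(sessions, answers);
console.log(flat);
```

A session without answers (session 3 here) still comes out, carrying only its own fields.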

If the pipeline is not clear, I have created a MongoDB playground (Playground Link), so try executing it one stage at a time.

Please refer to the documentation for the stages and operators used in this pipeline: $lookup, $addFields, $arrayToObject, $mergeObjects, $replaceRoot, $unset.

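Note: make sure the value used for the as field of the $lookup stage does not occur as a typeofField in the answers collection, otherwise it will be removed at the $unset stage. So, for the pipeline below, the answers collection should not contain { ... typeofField: "allAnswers" ... }.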
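
The clash described in the note can be reproduced in plain JavaScript. This sketch assumes a made-up answer whose typeofField is "allAnswers"; object spread and `delete` stand in for `$mergeObjects` and `$unset`, so it illustrates the effect rather than running the actual pipeline.

```javascript
// Hypothetical session whose answers include a field named like the `as` value.
const session = { _id: 1, created_at: "01/01/2021" };
const lookedUp = [{ name: "Name1" }, { allAnswers: "I will be lost" }];

// After the lookup, the root document carries the array under `allAnswers`.
const root = { ...session, allAnswers: lookedUp };

// Merge step: merged answers first, then the root document on top.
const mergedDoc = { ...Object.assign({}, ...lookedUp), ...root };

// Removing the helper key also removes the answer that shared its name.
delete mergedDoc.allAnswers;

console.log(mergedDoc);
```

Picking an `as` name that cannot appear among the typeofField values avoids the problem.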

Pipeline

[
  {
    $lookup: {
      from: "answers",
      localField: "_id",
      foreignField: "sessionId",
      pipeline: [
        { $addFields: { keyValue: [["$typeofField", "$value"]] } },
        { $replaceRoot: { newRoot: { $arrayToObject: "$keyValue" } } },
      ],
      as: "allAnswers",
    },
  },
  {
    $replaceRoot: {
      newRoot: { $mergeObjects: [{ $mergeObjects: "$allAnswers" }, "$$ROOT"] },
    },
  },
  { $unset: "allAnswers" },
]

For users on versions before 5.0, use this lookup:

    $lookup: {
      from: "answers",
      let: { sid: "$_id" },
      pipeline: [
        { $match: { $expr: { $eq: ["$sessionId", "$$sid"] } } },
        { $addFields: { keyValue: [["$typeofField", "$value"]] } },
        { $replaceRoot: { newRoot: { $arrayToObject: "$keyValue" } } },
      ],
      as: "allAnswers",
    },

Another solution, going in the other direction (from answers to sessions):

c = db.answers.aggregate([
    // Bring all answers together as a k-v array:
    {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}

    // Do a 1:1 lookup:
    ,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}

    // We now have flds as a k-v array.  We know that Z[0] contains both
    // created_at and _id.  We seek to create a full k-v array that we can
    // turn into the target object, so working the expression below "backwards":
    // 1. Pull element 0 from the Z array
    // 2. Turn that into a k-v array, e.g. [{k:"_id",v:1},{k:"created_at",v:"02/01/2021"}]
    //    with $objectToArray.  Important: we pick up _id here.
    // 3. Concat the flds k-v array with the session lookup k-v array
    // 4. We now have a complete k-v representation of our data.  Use $arrayToObject
    //    to turn (e.g.) {k:"created_at",v:"02/01/2021"} into created_at:"02/01/2021"
    // 5. Don't assign the object to a fld (like X).  Instead make that object the
    //    new root.  newRoot is the only arg to $replaceRoot:
    ,{$replaceRoot: { newRoot:
        {$arrayToObject:
           {$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}

]);
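
The same reshaping can be followed in plain JavaScript, which may help when reading the nested expression from the inside out. This is an illustrative sketch rather than driver code: `toEntries` and `fromEntries` are hypothetical stand-ins for `$objectToArray` and `$arrayToObject`, and the sample data is a subset of the question's.

```javascript
const sessionDocs = [
  { _id: 1, created_at: "01/01/2021" },
  { _id: 2, created_at: "02/01/2021" },
];
const answerDocs = [
  { _id: 1, value: "Name1", sessionId: 1, typeofField: "name" },
  { _id: 2, value: "Firstname1", sessionId: 1, typeofField: "firstname" },
  { _id: 4, value: "Name2", sessionId: 2, typeofField: "name" },
];

// Stand-ins for $objectToArray and $arrayToObject (k-v array form).
const toEntries = (obj) => Object.entries(obj).map(([k, v]) => ({ k, v }));
const fromEntries = (kv) => Object.fromEntries(kv.map(({ k, v }) => [k, v]));

// Grouping: collect each session's answers as a k-v array keyed by sessionId.
const groups = new Map();
for (const a of answerDocs) {
  if (!groups.has(a.sessionId)) groups.set(a.sessionId, []);
  groups.get(a.sessionId).push({ k: a.typeofField, v: a.value });
}

// Lookup and new root: concat the answers with the session's own k-v pairs.
const reshaped = [...groups].map(([sid, flds]) => {
  const Z = sessionDocs.filter((s) => s._id === sid); // 1:1 lookup result
  return fromEntries([...flds, ...toEntries(Z[0])]);
});

console.log(reshaped);
```

Because this direction starts from the answers, a session that has no answers (session 3 in the question) produces no output document.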

Or, if you want more control over the fields instead of taking everything in the session document:

c = db.answers.aggregate([
    {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
    ,{$lookup: {from: "session", localField: "_id", foreignField: "_id", as: "Z"}}

    // Don't want all the fields from the lookup?  No problem: wrap the
    // $objectToArray with a filter and only let k = [_id,created_at,foo]
    // or whatever else you want.  Make sure to always include _id.
    // Of course, if you want to exclude fields and keep the rest, just use
    // the $not operator.  Be sure not to exclude _id; see commented cond below:
    ,{$replaceRoot: {newRoot: {$arrayToObject: {$concatArrays: [
        "$flds",
        {$filter: {
            input: {$objectToArray: {$arrayElemAt: ["$Z",0]}},
            as: "z",
            cond: {$in: ["$$z.k", ["_id","created_at","foo"]]}
            //cond: {$not: {$in: ["$$z.k", ["foo"]]}}
        }}
    ]}}}}
]);
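
The two cond variants can be compared in plain JavaScript on a k-v array. This is a sketch under assumptions: the `internal_note` field is made up to have something to drop, and `Array.prototype.filter` with `includes` stands in for `$filter` with `$in`.

```javascript
// k-v array as $objectToArray would produce it for a hypothetical session document.
const sessionKv = [
  { k: "_id", v: 1 },
  { k: "created_at", v: "01/01/2021" },
  { k: "internal_note", v: "not wanted" }, // made-up extra field
];

// Include-list, like cond: {$in: ["$$z.k", [...]]}.
const keep = ["_id", "created_at", "foo"];
const included = sessionKv.filter((z) => keep.includes(z.k));

// Exclude-list, like cond: {$not: {$in: ["$$z.k", [...]]}}.
const drop = ["internal_note"];
const excluded = sessionKv.filter((z) => !drop.includes(z.k));

console.log(included.map((z) => z.k));
console.log(excluded.map((z) => z.k));
```

Listing a key that the document does not have (`foo` here) is harmless in the include-list: it simply matches nothing.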

Depending on how much material is being looked up in session, you may want to use the fancier version of $lookup to filter the fields there:

c = db.answers.aggregate([
    {$group: {_id: "$sessionId", flds: {$push: {k: "$typeofField", v: "$value"}}}}
    ,{$lookup: {from: "session",
                let: { sid: "$_id" },
                pipeline: [
                    {$match: {$expr: {$eq: [ "$_id", "$$sid" ]} }},
                    {$project: {"_id":true, "created_at":true,"foo":true}}
                ],
                as: "Z"
               }}
    ,{$replaceRoot: { newRoot:
        {$arrayToObject:
           {$concatArrays: [ "$flds", {$objectToArray: {$arrayElemAt: ["$Z",0]}} ] }}}}
]);