How to retrieve each single array element from a mongo pipeline?

Let's assume this is what a sample document in MongoDB looks like:

[
  {
    "_id": "1",
    "attrib_1": "value_1",
    "attrib_2": "value_2",
    "months": {
      "2": {
        "month": "2",
        "year": "2008",
        "transactions": [
          {
            "field_1": "val_1",
            "field_2": "val_2",
            
          },
          {
            "field_1": "val_4",
            "field_2": "val_5",
            "field_3": "val_6"
          },
          
        ]
      },
      "3": {
        "month": "3",
        "year": "2018",
        "transactions": [
          {
            "field_1": "val_7",
            "field_3": "val_9"
          },
          {
            "field_1": "val_10",
            "field_2": "val_11",
            
          },
          
        ]
      },
      
    }
  }
]

The desired output is like this (I have only shown months 2 and 3):

id  months  year  field_1  field_2  field_3
1   2       2008  val_1    val_2
1   2       2008  val_4    val_5    val_6
1   3       2018  val_7             val_9
1   3       2018  val_10   val_11

My try:

I have tried something like this in PyMongo:

pipeline = [
    {
        # some filter logic here to filter data basically first
    },
    {
        "$addFields": {
            "latest": {
                "$map": {
                    "input": {
                        "$objectToArray": "$months",
                    },
                    "as": "obj",
                    "in": {
                        "all_field_1" : {"$ifNull" : ["$$obj.v.transactions.field_1", [""]]},
                        "all_field_2": {"$ifNull" : ["$$obj.v.transactions.field_2", [""]]},
                        "all_field_3": {"$ifNull" : ["$$obj.v.transactions.field_3", [""]]},
                        "all_months" : {"$ifNull" : ["$$obj.v.month", ""]},
                        "all_years" : {"$ifNull" : ["$$obj.v.year", ""]},
                    }
                }
            }
        }
    },
    {
        "$project": {
            "_id": 1,
            "months": "$latest.all_months",
            "year":  "$latest.all_years",
            "field_1": "$latest.all_field_1",
            "field_2": "$latest.all_field_2",
            "field_3": "$latest.all_field_3",

        }
    }
]

# and I executed it as
my_db.collection.aggregate(pipeline, allowDiskUse=True)

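The above does bring the data in, but it brings it in as lists. Is there a way to easily get one row per transaction in mongo itself?

The above returns data like this: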

id  months      year              field_1                                    field_2                               field_3
1   ["2", "3"]  ["2008", "2018"]  [["val_1", "val_4"], ["val_7", "val_10"]]  [["val_2", "val_5"], ["", "val_11"]]  [["", "val_6"], ["val_9", ""]]

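Your valuable input on this, and on a better way to do it, is much appreciated!

Thanks for your time.

My Mongo version is 3.4.6 and I am using PyMongo as my driver. You can see the query in action at mongo-db-playground.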

It might not be a good idea to do all of the processing in an aggregation query; you could do this on the client side instead.
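As a rough illustration of the client-side route, here is a minimal sketch that flattens documents shaped like the sample in the question into one row per transaction. The function name `flatten_transactions` and the inline `sample` data are made up for this example, and missing fields are filled with an empty string to mirror the `$ifNull` defaults in the question.

```python
from typing import Any, Dict, Iterable, Iterator


def flatten_transactions(docs: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per transaction, for every month of every document."""
    for doc in docs:
        # "months" is an object keyed by month number, so iterate its values.
        for month in doc.get("months", {}).values():
            for txn in month.get("transactions", []):
                yield {
                    "id": doc["_id"],
                    "months": month.get("month", ""),
                    "year": month.get("year", ""),
                    # Absent fields default to "" (an assumption of this sketch).
                    "field_1": txn.get("field_1", ""),
                    "field_2": txn.get("field_2", ""),
                    "field_3": txn.get("field_3", ""),
                }


# Hypothetical in-memory copy of the sample document from the question.
sample = [
    {
        "_id": "1",
        "months": {
            "2": {
                "month": "2",
                "year": "2008",
                "transactions": [
                    {"field_1": "val_1", "field_2": "val_2"},
                    {"field_1": "val_4", "field_2": "val_5", "field_3": "val_6"},
                ],
            },
            "3": {
                "month": "3",
                "year": "2018",
                "transactions": [
                    {"field_1": "val_7", "field_3": "val_9"},
                    {"field_1": "val_10", "field_2": "val_11"},
                ],
            },
        },
    }
]

rows = list(flatten_transactions(sample))
for row in rows:
    print(row)
```

With a real collection, `docs` could be the cursor returned by `my_db.collection.find(...)` with whatever filter you already apply, so the server only does the filtering and the reshaping happens in Python.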

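I have created a lengthy query; it may cause performance issues with large data:

  • $objectToArray converts the months object to an array
  • $unwind deconstructs the months array
  • $unwind deconstructs the transactions array and provides the array position in an index field
  • $group by _id, year, month and index, and get the first transaction object into fields
  • $project lets you shape the response if you want; it is optional, and I have added it in the playground link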
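my_db.collection.aggregate([
    # some filter logic here to filter data basically first
    # Convert the months object into an array of {k, v} pairs.
    {"$project": {"months": {"$objectToArray": "$months"}}},
    # One document per month.
    {"$unwind": "$months"},
    # One document per transaction; keep its array position in "index".
    {
        "$unwind": {
            "path": "$months.v.transactions",
            "includeArrayIndex": "index"
        }
    },
    # Group by document, year, month and index, keeping the transaction object.
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "year": "$months.v.year",
                "month": "$months.v.month",
                "index": "$index"
            },
            "fields": {"$first": "$months.v.transactions"}
        }
    }
], allowDiskUse=True)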

Playground
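
The optional $project step is not spelled out above, so here is one possible sketch of it, written as extra stages to append after the $group stage. The name `flatten_stages` is made up for this example, the `$sort` stage is my own addition (the output order of `$group` is not guaranteed), and the empty-string defaults are an assumption borrowed from the `$ifNull` usage in the question.

```python
# Hypothetical final stages, to be appended after the $group stage.
flatten_stages = [
    # $group does not guarantee output order, so sort explicitly.
    # Note: month is stored as a string here, so "10" would sort before "2".
    {"$sort": {"_id._id": 1, "_id.month": 1, "_id.index": 1}},
    {
        "$project": {
            # Lift the grouping keys back to top-level columns.
            "_id": "$_id._id",
            "months": "$_id.month",
            "year": "$_id.year",
            # Missing transaction fields become "" (assumed default).
            "field_1": {"$ifNull": ["$fields.field_1", ""]},
            "field_2": {"$ifNull": ["$fields.field_2", ""]},
            "field_3": {"$ifNull": ["$fields.field_3", ""]},
        }
    },
]
```

Each resulting document should then carry the six columns from the desired output, one document per transaction.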