How to retrieve each single array element from mongo pipeline?
Let's assume this is what a sample document looks like in MongoDB:
[
  {
    "_id": "1",
    "attrib_1": "value_1",
    "attrib_2": "value_2",
    "months": {
      "2": {
        "month": "2",
        "year": "2008",
        "transactions": [
          {
            "field_1": "val_1",
            "field_2": "val_2"
          },
          {
            "field_1": "val_4",
            "field_2": "val_5",
            "field_3": "val_6"
          }
        ]
      },
      "3": {
        "month": "3",
        "year": "2018",
        "transactions": [
          {
            "field_1": "val_7",
            "field_3": "val_9"
          },
          {
            "field_1": "val_10",
            "field_2": "val_11"
          }
        ]
      }
    }
  }
]
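The desired output looks like this (I'm only showing months 2 and 3):
id | months | year | field_1 | field_2 | field_3 |
---|---|---|---|---|---|
1 | 2 | 2008 | val_1 | val_2 | |
1 | 2 | 2008 | val_4 | val_5 | val_6 |
1 | 3 | 2018 | val_7 | | val_9 |
1 | 3 | 2018 | val_10 | val_11 | |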
My attempt:
I tried something like this in PyMongo:
pipeline = [
    # Placeholder stage: the real filter logic goes here to narrow the data first.
    {"$match": {}},
    {
        "$addFields": {
            "latest": {
                "$map": {
                    # Turn the months object into an array of {k, v} pairs.
                    "input": {"$objectToArray": "$months"},
                    "as": "obj",
                    "in": {
                        "all_field_1": {"$ifNull": ["$$obj.v.transactions.field_1", [""]]},
                        "all_field_2": {"$ifNull": ["$$obj.v.transactions.field_2", [""]]},
                        "all_field_3": {"$ifNull": ["$$obj.v.transactions.field_3", [""]]},
                        "all_months": {"$ifNull": ["$$obj.v.month", ""]},
                        "all_years": {"$ifNull": ["$$obj.v.year", ""]},
                    },
                }
            }
        }
    },
    {
        "$project": {
            "_id": 1,
            "months": "$latest.all_months",
            "year": "$latest.all_years",
            "field_1": "$latest.all_field_1",
            "field_2": "$latest.all_field_2",
            "field_3": "$latest.all_field_3",
        }
    },
]
# and I executed it as
my_db.collection.aggregate(pipeline, allowDiskUse=True)
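The above does bring in the data, but it brings it in as lists. Is there an easy way to put each element on its own row within Mongo itself?
This is how the above returns the data:
id | months | year | field_1 | field_2 | field_3 |
---|---|---|---|---|---|
1 | ["2", "3"] | ["2008", "2018"] | [["val_1", "val_4"], ["val_7", "val_10"]] | [["val_2", "val_5"], ["", "val_11"]] | [["", "val_6"], ["val_9", ""]] |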
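I would really appreciate your input on this and on any better way of doing it!
Thank you for your time.
My Mongo version is 3.4.6 and I am using PyMongo as my driver. You can see the query in action at mongo-db-playground.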
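It may not be a good idea to do the whole process in an aggregation query; you could do this on the client side instead.
I have created a query that is lengthy and may cause performance problems on large data sets:
- $objectToArray converts the months object into an array
- $unwind deconstructs the months array
- $unwind deconstructs the transactions array and provides the array index in the field index
- $group groups by _id, year, month and index, and takes the first object from transactions into the field fields
- $project lets you shape the response if you want to; it is optional, and I have added it in the playground link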
my_db.collection.aggregate([
    # Placeholder stage: the real filter logic goes here to narrow the data first.
    {"$match": {}},
    # Convert the months object into an array of {k, v} pairs.
    {"$project": {"months": {"$objectToArray": "$months"}}},
    # One document per month.
    {"$unwind": "$months"},
    # One document per transaction, keeping its position in "index".
    {
        "$unwind": {
            "path": "$months.v.transactions",
            "includeArrayIndex": "index"
        }
    },
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "year": "$months.v.year",
                "month": "$months.v.month",
                "index": "$index"
            },
            "fields": {"$first": "$months.v.transactions"}
        }
    }
], allowDiskUse=True)
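The optional `$project` step is only described in the answer, not shown. A minimal sketch of what such a final stage might look like, assuming the output shape of the `$group` stage above (`_id` holding `_id`, `year`, `month` and `index`, and the transaction in `fields`). The helper name `build_project_stage` is hypothetical, and the empty-string default for missing fields is borrowed from the question's own `$ifNull` usage:

```python
def build_project_stage(field_names=("field_1", "field_2", "field_3")):
    """Build a $project stage that flattens the grouped documents into rows."""
    projection = {
        "_id": "$_id._id",       # original document id
        "months": "$_id.month",  # month taken from the group key
        "year": "$_id.year",     # year taken from the group key
    }
    for name in field_names:
        # Missing transaction fields become empty strings.
        projection[name] = {"$ifNull": ["$fields." + name, ""]}
    return {"$project": projection}


project_stage = build_project_stage()
print(project_stage)
```

The returned dictionary would be appended as the last element of the pipeline list. Note that `$group` does not guarantee output order, so a `$sort` stage may be needed if row order matters.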
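The answer also suggests doing the reshaping on the client side instead of in the aggregation. A rough sketch of that idea in plain Python, assuming documents shaped like the sample in the question. The function name `flatten_transactions` and the choice of an empty string for missing fields are illustrative assumptions, not part of the original answer:

```python
def flatten_transactions(doc, field_names=("field_1", "field_2", "field_3")):
    """Yield one flat row (dict) per transaction found under doc["months"]."""
    months = doc.get("months") or {}
    for month in months.values():
        for txn in month.get("transactions", []):
            row = {
                "id": doc["_id"],
                "months": month.get("month", ""),
                "year": month.get("year", ""),
            }
            for name in field_names:
                # Fields absent from a transaction are filled with "".
                row[name] = txn.get(name, "")
            yield row


# Sample document copied from the question.
sample = {
    "_id": "1",
    "attrib_1": "value_1",
    "attrib_2": "value_2",
    "months": {
        "2": {
            "month": "2",
            "year": "2008",
            "transactions": [
                {"field_1": "val_1", "field_2": "val_2"},
                {"field_1": "val_4", "field_2": "val_5", "field_3": "val_6"},
            ],
        },
        "3": {
            "month": "3",
            "year": "2018",
            "transactions": [
                {"field_1": "val_7", "field_3": "val_9"},
                {"field_1": "val_10", "field_2": "val_11"},
            ],
        },
    },
}

rows = list(flatten_transactions(sample))
for row in rows:
    print(row)
```

With PyMongo, the same function could be applied to each document returned by a cursor, for example after a `find()` that applies the filter logic, so only the filtering runs on the server.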