Mongo query to get duplicate entries from a subdocument array
The objective is to find duplicate records in a subdocument array based on a condition and return output in the shape shown below.
Dataset
[
  {
    _id: "objectId",
    name: "product_a",
    array: [
      {
        _id: "objectId",
        start: "2022-01-01 00:00:00.000Z",
        end: "2022-01-30 00:00:00.000Z",
        status: "active",
        person: "A" // reference objectId
      },
      {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "A"
      },
      {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "A"
      },
      {
        _id: "objectId",
        start: "2022-02-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "B"
      }
    ]
  },
  {
    _id: "objectId",
    name: "product_b",
    array: [
      {
        _id: "objectId",
        start: "2021-12-30 00:00:00.000Z",
        end: "2022-01-30 00:00:00.000Z",
        status: "active",
        person: "C"
      },
      {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "C"
      },
      {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "C"
      },
      {
        _id: "objectId",
        start: "2022-03-20 00:00:00.000Z",
        end: null,
        status: "active",
        person: "D"
      }
    ]
  }
]
Expected output
[
  {
    _id: "objectId",
    name: "product_a",
    targetIds: ["A"]
  },
  {
    _id: "objectId",
    name: "product_b",
    targetIds: ["C"]
  }
]
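For each document, I am trying to collect the duplicated person values into an array (targetIds), where a person counts as duplicated when they have two or more active records in the subdocument array whose end is null. Here is the query I tried: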
connectCollection.aggregate([
  {
    $unwind: "$array"
  },
  {
    "$match": {
      $and: [
        {
          "array.status": {
            "$exists": true,
            "$eq": "active"
          }
        },
        {
          "array.end": {
            "$exists": true,
            "$eq": null
          }
        }
      ]
    }
  },
  {
    $group: {
      _id: {
        product_id: "$_id",
        product_name: "$name",
        targetIds: "$array.person"
      },
      count: {
        $sum: 1
      }
    }
  },
  {
    $match: {
      count: {
        $gt: 1
      }
    }
  },
  { $sort: { _id: 1 } }
])
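You are only missing the last step, a second $group that collapses the per-person groups back into one document per product:
{
  $group: {
    _id: "$_id.product_id",
    name: {
      $first: "$_id.product_name"
    },
    targetIds: {
      $push: "$_id.targetIds"
    }
  }
}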
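and change the field in the second array of the sample data from user to person, as used in your query (and as noted by @Yong Shun).
You can see it working here, with some small changes to the input data to demonstrate the solution.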
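For reference, the query from the question and the extra stage can be put together roughly as follows. This is a sketch, not a tested drop-in: the collection name `connectCollection` and all field names come from the question, the implicit AND in `$match` replaces the explicit `$and`, and moving `$sort` to the end is my own assumption (a `$group` stage does not guarantee the order of its output, so sorting before it has no reliable effect).

```javascript
// Full pipeline: one output document per product, listing every person
// that has more than one active record with end === null.
const pipeline = [
  // One document per element of the subdocument array.
  { $unwind: "$array" },

  // Keep only active records whose `end` field exists and is null.
  {
    $match: {
      "array.status": "active",
      "array.end": { $exists: true, $eq: null }
    }
  },

  // Count the matching records per (product, person) pair.
  {
    $group: {
      _id: {
        product_id: "$_id",
        product_name: "$name",
        targetIds: "$array.person"
      },
      count: { $sum: 1 }
    }
  },

  // A person is a duplicate when they appear more than once.
  { $match: { count: { $gt: 1 } } },

  // The missing step: regroup by product and collect the duplicated persons.
  {
    $group: {
      _id: "$_id.product_id",
      name: { $first: "$_id.product_name" },
      targetIds: { $push: "$_id.targetIds" }
    }
  },

  // Sort last (assumption: the caller wants a stable order by product _id).
  { $sort: { _id: 1 } }
];

// Usage (requires a live connection, so it is left commented out):
// connectCollection.aggregate(pipeline)
```

Assuming each product has its own distinct `_id`, the sample data should produce one document for product_a with `targetIds: ["A"]` and one for product_b with `targetIds: ["C"]`, since B and D each have only a single matching record.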