How to get n documents for each unique ID in MongoDB when querying with $in?
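I have a collection that stores a document every 2 seconds for each uID (primary key). I want to query the last 30 documents for each uID and group them together by their index number. How should I proceed? If I apply $limit, I get only 30 documents in total across all uIDs, not 30 per uID.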
db.getCollection("devices").aggregate(
[
{
"$match" : {
"uID" : {
"$in" : [
"20200308",
"20200306",
"12345678"
]
}
}
}
],
{
"allowDiskUse" : false
}
);
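The query above returns all matching documents. How can I limit it to 30 documents per uID? Also, once that works, how can I group the documents by array index so that, for each index, I get the sum of a field's values across the documents at that position? Example of the desired output: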
[
{
_id: 0, // array index shared by the documents grouped together
sumOfValuesFoundInArrayIndex: 100,
},
{
_id: 1,
sumOfValuesFoundInArrayIndex: 600,
}
.
.
.
.
{
_id: 29,
sumOfValuesFoundInArrayIndex: 600,
}
]
Sample JSON:
[
{
"timeStamp": 1644962269,
"uID": "20200308",
"capacityLeft": 500
},
{
"timeStamp": 1644962272,
"uID": "20200308",
"capacityLeft": 499
},
{
"timeStamp": 1644962275,
"uID": "20200306",
"capacityLeft": 300
},
{
"timeStamp": 1644962277,
"uID": "20200308",
"capacityLeft": 499
},
{
"timeStamp": 1644962277,
"uID": "20200306",
"capacityLeft": 300
},
{
"timeStamp": 1644962279,
"uID": "12345678",
"capacityLeft": 753
},
{
"timeStamp": 1644962281,
"uID": "12345678",
"capacityLeft": 752
},
{
"timeStamp": 1644962283,
"uID": "12345678",
"capacityLeft": 751
}
]
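Given this JSON, I need to find 30 documents for each uID, sorted by timestamp, so that when I query all of the listed devices I get the last 30 documents for each uID, group them by array index, and then sum the capacityLeft values.
Maybe something like this: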
db.collection.aggregate([
{
$match: {
uID: {
$in: [
"20200308",
"20200306"
]
}
}
},
{
$sort: {
uID: 1,
"timeStamp": -1
}
},
{
$group: {
_id: "$uID",
ts: {
$push: "$$ROOT"
}
}
},
{
$project: {
ts: {
$slice: [
"$ts",
30
]
}
}
},
{
$unwind: "$ts"
},
{
$group: {
_id: "$_id",
sumcapLast30: {
$sum: "$ts.capacityLeft"
}
}
}
])
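Explanation:
- Match all of the required uIDs
- Sort by uID, then by timeStamp in descending order
- Group by uID
- Slice, keeping only the first 30 elements of the ts array (the newest 30 documents, because of the descending sort)
- Unwind the ts array
- Group by uID and sum capacityLeft over those 30 elements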
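The pipeline above sums per uID, while the question asks for sums per array index. A minimal sketch of that variant follows, assuming the `$unwind` option `includeArrayIndex` is used to expose each element's position. The names `lastNByIndexPipeline`, `sumByIndex`, `caps` and `idx` are hypothetical and not from the original post. The second function only mirrors the same stages in plain JavaScript so the logic can be checked without a server.

```javascript
// Hypothetical helper, not part of the original answer: builds a pipeline that
// keeps the newest n documents per uID and sums capacityLeft per array index.
function lastNByIndexPipeline(uIDs, n) {
  return [
    { $match: { uID: { $in: uIDs } } },
    // Newest first within each uID, so index 0 is the latest reading.
    { $sort: { uID: 1, timeStamp: -1 } },
    { $group: { _id: "$uID", caps: { $push: "$capacityLeft" } } },
    { $project: { caps: { $slice: ["$caps", n] } } },
    // includeArrayIndex stores each element's position in the field "idx".
    { $unwind: { path: "$caps", includeArrayIndex: "idx" } },
    { $group: { _id: "$idx", sumOfValuesFoundInArrayIndex: { $sum: "$caps" } } },
    { $sort: { _id: 1 } }
  ];
}

// Plain-JavaScript mirror of the same stages, for checking the logic offline.
function sumByIndex(docs, uIDs, n) {
  const perUid = new Map();
  docs
    .filter((d) => uIDs.includes(d.uID))
    .sort((a, b) => b.timeStamp - a.timeStamp) // newest first
    .forEach((d) => {
      if (!perUid.has(d.uID)) perUid.set(d.uID, []);
      perUid.get(d.uID).push(d.capacityLeft);
    });
  const sums = [];
  for (const caps of perUid.values()) {
    // Keep the newest n values and add each one to the total for its index.
    caps.slice(0, n).forEach((value, idx) => {
      sums[idx] = (sums[idx] || 0) + value;
    });
  }
  return sums.map((sum, idx) => ({ _id: idx, sumOfValuesFoundInArrayIndex: sum }));
}
```

In the shell this would presumably be run as db.getCollection("devices").aggregate(lastNByIndexPipeline(["20200308", "20200306", "12345678"], 30)). A uID with fewer than n documents simply contributes to fewer indexes. On MongoDB 5.2 or later, the $topN accumulator may be able to replace the sort, push and slice stages, though that is untested here.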