How to group multiple operations using MongoDB aggregation
Given the following data:
> db.users.find({}, {name: 1, createdAt: 1, updatedAt: 1}).limit(5).pretty()
{
    "_id" : ObjectId("5ec8f74f32973c7b7cb7cce9"),
    "createdAt" : ISODate("2020-05-23T10:13:35.012Z"),
    "updatedAt" : ISODate("2020-08-20T13:37:09.861Z"),
    "name" : "Patrick Jere"
}
{
    "_id" : ObjectId("5ec8ef8a2b6e5f78fa20443c"),
    "createdAt" : ISODate("2020-05-23T09:40:26.089Z"),
    "updatedAt" : ISODate("2020-07-23T07:54:01.833Z"),
    "name" : "Austine Wiga"
}
{
    "_id" : ObjectId("5ed5e1a3962a3960ad85a1a2"),
    "createdAt" : ISODate("2020-06-02T05:20:35.090Z"),
    "updatedAt" : ISODate("2020-07-29T14:02:52.295Z"),
    "name" : "Biasi Phiri"
}
{
    "_id" : ObjectId("5ed629ec6d87382c608645d9"),
    "createdAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "updatedAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "name" : "Chisambwe Kalusa"
}
{
    "_id" : ObjectId("5ed8d21f42bc8115f67465a8"),
    "createdAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "updatedAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "name" : "Wakun Moyo"
}
...
I use the following query to return new_users per month:
db.users.aggregate([
    {
        $group: {
            _id: {$dateToString: {format: '%Y-%m', date: '$createdAt'}},
            new_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])
Sample results:
[
    {
        "_id": "2020-06",
        "new_users": 125
    },
    {
        "_id": "2020-07",
        "new_users": 147
    },
    {
        "_id": "2020-08",
        "new_users": 43
    },
    {
        "_id": "2020-05",
        "new_users": 4
    }
]
And this query returns new_users, active_users and total_users for a specific month:
db.users.aggregate([
    {
        $group: {
            _id: null,
            new_users: {
                $sum: {
                    $cond: [{
                        $gte: ['$createdAt', ISODate('2020-08-01')]
                    }, 1, 0]
                }
            },
            active_users: {
                $sum: {
                    $cond: [{
                        $gt: ['$updatedAt', ISODate('2020-02-01')]
                    }, 1, 0]
                }
            },
            total_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])
How can I get the second query to return its results per month, like the first query does?
Expected results, based on a month filter:
[
{ _id: '2020-09', new_users: 0, active_users: 69},
{ _id: '2020-08', new_users: 43, active_users: 219},
{ _id: '2020-07', new_users: 147, active_users: 276},
{ _id: '2020-06', new_users: 125, active_users: 129},
{ _id: '2020-05', new_users: 4, active_users: 4}
]
You can do it the same way as in your first query, grouping by createdAt. There is no need to use the $ifNull operator in total_users.
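As a minimal sketch of that suggestion: `{$ifNull: [1, 0]}` always evaluates to 1, because the literal 1 is never null, so a plain `$sum: 1` counts the same documents. The variable name `newUsersByMonth` and the trailing `$sort` stage are assumptions added for illustration; they are not part of the original query.

```javascript
// Pipeline written as a plain array so it can be passed to aggregate().
const newUsersByMonth = [
  {
    $group: {
      // One group per creation month, e.g. "2020-06".
      _id: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
      // Count every document in the group; no $ifNull needed.
      new_users: { $sum: 1 }
    }
  },
  // Assumption: newest month first, as in the expected output.
  { $sort: { _id: -1 } }
];

// In the mongo shell: db.users.aggregate(newUsersByMonth)
```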
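Updated:
- $facet to group by month and compute both counts
- $project to concatenate the two arrays using $concatArrays
- $unwind to deconstruct the array root
- $group by month to merge the months and their counts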
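The steps above could be sketched as the pipeline below. It is a reconstruction from the step list, not the answerer's exact code. It assumes that a user counts as active in the month of their updatedAt; the helper `monthOf`, the name `monthlyStats`, and the final `$sort` are hypothetical additions.

```javascript
// Hypothetical helper: formats a date field as "YYYY-MM".
const monthOf = (field) => ({
  $dateToString: { format: "%Y-%m", date: field }
});

const monthlyStats = [
  // 1. Two independent groupings over the same input documents.
  {
    $facet: {
      new_users: [
        { $group: { _id: monthOf("$createdAt"), new_users: { $sum: 1 } } }
      ],
      active_users: [
        // Assumption: "active in a month" means updatedAt falls in it.
        { $group: { _id: monthOf("$updatedAt"), active_users: { $sum: 1 } } }
      ]
    }
  },
  // 2. Merge both result arrays into a single array named root.
  { $project: { root: { $concatArrays: ["$new_users", "$active_users"] } } },
  // 3. One document per array element.
  { $unwind: "$root" },
  // 4. Merge the entries that share a month; $sum skips missing
  //    fields, so a month without new users gets new_users: 0.
  {
    $group: {
      _id: "$root._id",
      new_users: { $sum: "$root.new_users" },
      active_users: { $sum: "$root.active_users" }
    }
  },
  // Assumption: newest month first, as in the expected output.
  { $sort: { _id: -1 } }
];

// In the mongo shell: db.users.aggregate(monthlyStats)
```

Because each facet groups on its own date field, a month that only appears in updatedAt still produces a row.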
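You can try the aggregation below.
It counts the new users per month, then uses a $lookup to count the active users in the time window for each year-month.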
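db.users.aggregate([
  // Count new users per month; _id is the first day of the creation month.
  {"$group": {
    "_id": {"$dateFromParts": {"year": {"$year": "$createdAt"}, "month": {"$month": "$createdAt"}}},
    "new_users": {"$sum": 1}
  }},
  // Count users whose updatedAt falls in [start_date, end_date).
  {"$lookup": {
    "from": "users",
    "let": {
      // end_date: first day of the group's month.
      "end_date": "$_id",
      // start_date: first day of the preceding month.
      "start_date": {"$dateFromParts": {
        "year": {"$year": "$_id"},
        "month": {"$subtract": [{"$month": "$_id"}, 1]}
      }}
    },
    "pipeline": [
      {"$match": {"$expr": {"$and": [
        {"$gte": ["$updatedAt", "$$start_date"]},
        {"$lt": ["$updatedAt", "$$end_date"]}
      ]}}},
      {"$count": "activeUserCount"}
    ],
    "as": "activeUsers"
  }},
  // Format the month label and take the count out of the lookup result array.
  {"$project": {
    "year-month": {"$dateToString": {"format": "%Y-%m", "date": "$_id"}},
    "new_users": 1,
    "active_users": {"$arrayElemAt": ["$activeUsers.activeUserCount", 0]},
    "_id": 0
  }}
])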
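To see which documents the lookup's window picks up for a given group key, the date arithmetic can be mirrored in plain JavaScript. This is only an in-memory sketch using the five sample updatedAt values from the question. The helpers `shiftMonth`, `lookupWindow`, `sameMonthWindow` and `countActive` are hypothetical names, and `sameMonthWindow` is a possible variant rather than something the answer contains.

```javascript
// The five sample updatedAt values from the question.
const updatedAt = [
  "2020-08-20T13:37:09.861Z",
  "2020-07-23T07:54:01.833Z",
  "2020-07-29T14:02:52.295Z",
  "2020-06-02T10:29:00.204Z",
  "2020-06-04T10:51:11.546Z"
].map((s) => new Date(s));

// First day (UTC) of the month `offset` months away from monthStart.
function shiftMonth(monthStart, offset) {
  return new Date(
    Date.UTC(monthStart.getUTCFullYear(), monthStart.getUTCMonth() + offset, 1)
  );
}

// Window as written in the $lookup:
// [start of the preceding month, start of the group's month).
function lookupWindow(monthStart) {
  return { start: shiftMonth(monthStart, -1), end: monthStart };
}

// Hypothetical variant:
// [start of the group's month, start of the following month).
function sameMonthWindow(monthStart) {
  return { start: monthStart, end: shiftMonth(monthStart, 1) };
}

// Same condition as the $match stage: start <= updatedAt < end.
function countActive({ start, end }) {
  return updatedAt.filter((d) => d >= start && d < end).length;
}

const august = new Date("2020-08-01T00:00:00Z");
console.log(countActive(lookupWindow(august)));    // 2 (the two July updates)
console.log(countActive(sameMonthWindow(august))); // 1 (the single August update)
```

If active users should be counted within the group's own month, the `let` variables would need to follow the second window instead of the first.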