How to group multiple operations using MongoDB aggregation

Given the following data:

> db.users.find({}, {name: 1, createdAt: 1, updatedAt: 1}).limit(5).pretty()
{
    "_id" : ObjectId("5ec8f74f32973c7b7cb7cce9"),
    "createdAt" : ISODate("2020-05-23T10:13:35.012Z"),
    "updatedAt" : ISODate("2020-08-20T13:37:09.861Z"),
    "name" : "Patrick Jere"
}
{
    "_id" : ObjectId("5ec8ef8a2b6e5f78fa20443c"),
    "createdAt" : ISODate("2020-05-23T09:40:26.089Z"),
    "updatedAt" : ISODate("2020-07-23T07:54:01.833Z"),
    "name" : "Austine Wiga"
}
{
    "_id" : ObjectId("5ed5e1a3962a3960ad85a1a2"),
    "createdAt" : ISODate("2020-06-02T05:20:35.090Z"),
    "updatedAt" : ISODate("2020-07-29T14:02:52.295Z"),
    "name" : "Biasi Phiri"
}
{
    "_id" : ObjectId("5ed629ec6d87382c608645d9"),
    "createdAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "updatedAt" : ISODate("2020-06-02T10:29:00.204Z"),
    "name" : "Chisambwe Kalusa"
}
{
    "_id" : ObjectId("5ed8d21f42bc8115f67465a8"),
    "createdAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "updatedAt" : ISODate("2020-06-04T10:51:11.546Z"),
    "name" : "Wakun Moyo"
}
...

Sample Data

I use the following query to return new_users counted by month:

db.users.aggregate([
    {
        $group: {
            _id: {$dateToString: {format: '%Y-%m', date: '$createdAt'}},
            new_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])

Sample result:

[
  {
    "_id": "2020-06",
    "new_users": 125
  },
  {
    "_id": "2020-07",
    "new_users": 147
  },
  {
    "_id": "2020-08",
    "new_users": 43
  },
  {
    "_id": "2020-05",
    "new_users": 4
  }
]

And this query returns new_users, active_users and total_users for a specific month:

db.users.aggregate([
    {
        $group: {
            _id: null,
            new_users: {
                $sum: {
                    $cond: [{
                        $gte: ['$createdAt', ISODate('2020-08-01')]
                    }, 1, 0]
                }
            },
            active_users: {
                $sum: {
                    $cond: [{
                        $gt: ['$updatedAt', ISODate('2020-02-01')]
                    }, 1, 0]
                }
            },
            total_users: {
                $sum: {$ifNull: [1, 0]}
            }
        }
    }
])

How can I make the second query return its results by month, like the first query does?

Expected result, based on a one-month filter:

[
  { _id: '2020-09', new_users: 0, active_users: 69},
  { _id: '2020-08', new_users: 43, active_users: 219},
  { _id: '2020-07', new_users: 147, active_users: 276},
  { _id: '2020-06', new_users: 125, active_users: 129},
  { _id: '2020-05', new_users: 4, active_users: 4}
]

You can do it the same way as in the first query: group by createdAt. There is no need to use the $ifNull operator in total_users.
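As a rough sketch of that suggestion (not necessarily what the linked playground contained), the two conditional sums from the second query can be moved into a $group keyed on the creation month. The cutoff date is copied from the question, the names activeSince and monthlyCountsPipeline are made up for illustration, and the trailing $sort is an optional extra.

```javascript
// Assumption: the cutoff is taken from the question's second query.
// new Date("2020-02-01") is the same instant as ISODate("2020-02-01") in the shell.
const activeSince = new Date("2020-02-01");

const monthlyCountsPipeline = [
  {
    $group: {
      // One bucket per creation month, e.g. "2020-08".
      _id: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
      // A plain { $sum: 1 } counts documents; wrapping the literal 1 in $ifNull adds nothing.
      new_users: { $sum: 1 },
      // Count only the documents in this bucket that were updated after the cutoff.
      active_users: {
        $sum: { $cond: [{ $gt: ["$updatedAt", activeSince] }, 1, 0] }
      }
    }
  },
  // Optional: newest month first.
  { $sort: { _id: -1 } }
];

// Runs only inside the mongo shell / mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(monthlyCountsPipeline).toArray());
}
```

Note that this counts active users per creation month, so a month in which users were only updated (and none created) gets no row.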

Playground


Updated:

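  • $facet to group by month and produce both counts
  • $project with $concatArrays to concatenate the two arrays
  • $unwind to deconstruct the array into root-level documents
  • $group by month to merge the months and counts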
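
The steps above could look roughly like the following. This is a sketch under assumptions: the facet names, the intermediate field name all, and the variable name facetPipeline are illustrative, and it assumes a user counts as active in the month of their updatedAt.

```javascript
const facetPipeline = [
  {
    // Run two independent groupings over the same input documents.
    $facet: {
      new_users: [
        {
          $group: {
            _id: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
            new_users: { $sum: 1 }
          }
        }
      ],
      active_users: [
        {
          $group: {
            _id: { $dateToString: { format: "%Y-%m", date: "$updatedAt" } },
            active_users: { $sum: 1 }
          }
        }
      ]
    }
  },
  // Merge the two result arrays into a single array.
  { $project: { all: { $concatArrays: ["$new_users", "$active_users"] } } },
  // Emit one document per array element.
  { $unwind: "$all" },
  {
    // Recombine by month; $sum treats a missing field as 0, so a month
    // present in only one facet still gets both counters.
    $group: {
      _id: "$all._id",
      new_users: { $sum: "$all.new_users" },
      active_users: { $sum: "$all.active_users" }
    }
  }
];

// Runs only inside the mongo shell / mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(facetPipeline).toArray());
}
```

Because the months come from both createdAt and updatedAt, a month with updates but no sign-ups (such as 2020-09 in the expected result) still appears, with new_users equal to 0.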

Playground

You can try the following aggregation.

It counts the new users per month, then uses a $lookup to count the active users within each year-month time window.

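db.users.aggregate([
  // Count new users per month; _id is the first day of the creation month.
  {"$group": {
    "_id": {"$dateFromParts": {"year": {"$year": "$createdAt"}, "month": {"$month": "$createdAt"}}},
    "new_users": {"$sum": 1}
  }},
  // For each month, count the users whose updatedAt falls in [start_date, end_date).
  {"$lookup": {
    "from": "users",
    "let": {
      // end_date: first day of the grouped month; start_date: first day of the month before it.
      "end_date": "$_id",
      "start_date": {"$dateFromParts": {"year": {"$year": "$_id"}, "month": {"$subtract": [{"$month": "$_id"}, 1]}}}
    },
    "pipeline": [
      {"$match": {"$expr": {"$and": [
        {"$gte": ["$updatedAt", "$$start_date"]},
        {"$lt": ["$updatedAt", "$$end_date"]}
      ]}}},
      {"$count": "activeUserCount"}
    ],
    "as": "activeUsers"
  }},
  // Format the month as YYYY-MM and pull the count out of the lookup result
  // (active_users is absent when no user matched the window).
  {"$project": {
    "year-month": {"$dateToString": {"format": "%Y-%m", "date": "$_id"}},
    "new_users": 1,
    "active_users": {"$arrayElemAt": ["$activeUsers.activeUserCount", 0]},
    "_id": 0
  }}
])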