Need help with a MongoDB aggregate $group stage

I have a collection of 1000 documents like this:

{ 
    "_id" : ObjectId("628b63d66a5951db6bb79905"), 
    "index" : 0, 
    "name" : "Aurelia Gonzales", 
    "isActive" : false, 
    "registered" : ISODate("2015-02-11T04:22:39.000+0000"), 
    "age" : 41, 
    "gender" : "female", 
    "eyeColor" : "green", 
    "favoriteFruit" : "banana", 
    "company" : {
        "title" : "YURTURE", 
        "email" : "aureliagonzales@yurture.com", 
        "phone" : "+1 (940) 501-3963", 
        "location" : {
            "country" : "USA", 
            "address" : "694 Hewes Street"
        }
    }, 
    "tags" : [
        "enim", 
        "id", 
        "velit", 
        "ad", 
        "consequat"
    ]
}

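I want to group these by registration year and gender. For example, in 2014, 105 males and 131 females registered. The final returned documents should look like this: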

{
    _id:2014,
    male:105,
    female:131,
    total:236
},
{
    _id:2015,
    male:136,
    female:128,
    total:264
}

I tried grouping by the year of registered and by gender, like this:

db.persons.aggregate([
    { $group: { _id: { year: { $year: "$registered" }, gender: "$gender" }, total: { $sum: NumberInt(1) } } },
    { $sort: { "_id.year": 1,"_id.gender":1 } }
])

This returns documents like this:

{ 
    "_id" : {
        "year" : 2014, 
        "gender" : "female"
    }, 
    "total" : 131
}
{ 
    "_id" : {
        "year" : 2014, 
        "gender" : "male"
    }, 
    "total" : 105
}

Any pointers would be appreciated.

Just add one more $group stage to the aggregation pipeline, as follows:

db.persons.aggregate([
    // Stage 1: count documents per (year, gender) pair.
    { $group: { _id: { year: { $year: "$registered" }, gender: "$gender" }, total: { $sum: NumberInt(1) } } },
    // Stage 2: fold the per-gender counts into one document per year.
    {
        $group: {
            _id: "$_id.year",
            male: {
                $sum: { $cond: { if: { $eq: ["$_id.gender", "male"] }, then: "$total", else: 0 } }
            },
            female: {
                $sum: { $cond: { if: { $eq: ["$_id.gender", "female"] }, then: "$total", else: 0 } }
            },
            total: { $sum: "$total" }
        }
    },
    // Stage 3: sort last, because $group does not guarantee the order of its output.
    { $sort: { _id: 1 } }
]);

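Here is a working link. In the added $group stage we group by year and conditionally sum the count for each gender; total is the sum of the counts regardless of gender.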
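To make the conditional sum easier to follow, here is a rough plain-JavaScript sketch of what the added $group stage does to the output of the first stage. It is only an illustration, not MongoDB code: the helper name groupByYear is made up, and the input uses the counts quoted in the question.

```javascript
// Plain-JavaScript illustration of the second $group stage (not MongoDB code).
// Input: per-year, per-gender counts, shaped like the output of the first $group.
const perGenderCounts = [
  { _id: { year: 2014, gender: "female" }, total: 131 },
  { _id: { year: 2014, gender: "male" }, total: 105 },
  { _id: { year: 2015, gender: "female" }, total: 128 },
  { _id: { year: 2015, gender: "male" }, total: 136 },
];

// Hypothetical helper: folds the counts into one document per year.
function groupByYear(docs) {
  const byYear = new Map();
  for (const doc of docs) {
    const year = doc._id.year;
    if (!byYear.has(year)) {
      byYear.set(year, { _id: year, male: 0, female: 0, total: 0 });
    }
    const acc = byYear.get(year);
    // Same role as $cond: add the count only to the matching gender field.
    acc.male += doc._id.gender === "male" ? doc.total : 0;
    acc.female += doc._id.gender === "female" ? doc.total : 0;
    // Same role as total: { $sum: "$total" }: every count is added.
    acc.total += doc.total;
  }
  // Sort by year so the result order is predictable.
  return [...byYear.values()].sort((a, b) => a._id - b._id);
}

console.log(groupByYear(perGenderCounts));
```

Each input document contributes to exactly one of male or female, and always to total, which is why total ends up equal to male plus female here.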

Besides the solution with two $group stages that @Gibbs proposed in the comments, you can also get the result as follows:

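  1. $group - Group by the year of registered. Push each gender value into a genders array.

  2. $sort - Sort by _id.

  3. $project - Shape the output documents.

    3.1. male - Get the size of the array that $filter returns for the value "male" in the genders array.

    3.2. female - Get the size of the array that $filter returns for the value "female" in the genders array.

    3.3. total - Get the size of the genders array.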

This approach is suggested if you expect to count and return only the "male" and "female" gender fields.

db.collection.aggregate([
  {
    $group: {
      _id: {
        $year: "$registered"
      },
      genders: {
        $push: "$gender"
      }
    }
  },
  {
    $sort: {
      "_id": 1
    }
  },
  {
    $project: {
      _id: 1,
      male: {
        $size: {
          $filter: {
            input: "$genders",
            cond: {
              $eq: [
                "$$this",
                "male"
              ]
            }
          }
        }
      },
      female: {
        $size: {
          $filter: {
            input: "$genders",
            cond: {
              $eq: [
                "$$this",
                "female"
              ]
            }
          }
        }
      },
      total: {
        $size: "$genders"
      }
    }
  }
])

Sample Mongo Playground
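For anyone who wants to trace the steps without a database, the same idea can be sketched in plain JavaScript. This is only an illustration: the helper name countByYear and the four sample documents are made up, and getUTCFullYear stands in for $year, which works in UTC when no timezone is given.

```javascript
// Hypothetical sample input; only the fields the pipeline reads are included.
const persons = [
  { registered: new Date("2014-03-01T00:00:00Z"), gender: "male" },
  { registered: new Date("2014-07-15T00:00:00Z"), gender: "female" },
  { registered: new Date("2014-11-30T00:00:00Z"), gender: "female" },
  { registered: new Date("2015-02-11T04:22:39Z"), gender: "female" },
];

// Hypothetical helper mirroring the $group, $sort and $project steps.
function countByYear(docs) {
  // Step 1 ($group + $push): collect the gender values per registration year.
  const gendersByYear = new Map();
  for (const doc of docs) {
    const year = doc.registered.getUTCFullYear();
    if (!gendersByYear.has(year)) gendersByYear.set(year, []);
    gendersByYear.get(year).push(doc.gender);
  }
  return [...gendersByYear.entries()]
    // Step 2 ($sort): order by year.
    .sort(([a], [b]) => a - b)
    // Step 3 ($project): $filter + $size becomes filter(...).length.
    .map(([year, genders]) => ({
      _id: year,
      male: genders.filter((g) => g === "male").length,
      female: genders.filter((g) => g === "female").length,
      total: genders.length,
    }));
}

console.log(countByYear(persons));
```

Note that total counts every element of the genders array, so a document with any other gender value would still be included in total even though it is in neither male nor female.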

db.collection.aggregate([
  {
    "$group": { //Group things
      "_id": "$_id.year",
      "gender": {
        "$addToSet": {
          k: "$_id.gender",
          v: "$total"
        }
      },
      sum: { //Sum it
        $sum: "$total"
      }
    }
  },
  {
    "$project": {//Reshape it
      g: {
        "$arrayToObject": "$gender"
      },
      _id: 1,
      sum: 1
    }
  },
  {
    "$project": { //Reshape it
      _id: 1,
      "g.female": 1,
      "g.male": 1,
      sum: 1
    }
  }
])

Play
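Note that the pipeline above reads $_id.year and $total, so it appears to expect the output of the first $group stage from the question as its input rather than the raw persons documents. Below is a rough plain-JavaScript sketch of the reshaping it performs, assuming that input. The helper name reshapeByYear is made up, and Object.fromEntries stands in for $arrayToObject.

```javascript
// Assumed input: per-year, per-gender counts from the question's first $group stage.
const countsFromFirstGroup = [
  { _id: { year: 2014, gender: "female" }, total: 131 },
  { _id: { year: 2014, gender: "male" }, total: 105 },
];

// Hypothetical helper mirroring the $group and $project stages.
function reshapeByYear(docs) {
  const byYear = new Map();
  for (const doc of docs) {
    const year = doc._id.year;
    if (!byYear.has(year)) byYear.set(year, { _id: year, gender: [], sum: 0 });
    const acc = byYear.get(year);
    // Like $addToSet with { k, v }; each (year, gender) pair is already unique,
    // so a plain push gives the same pairs here.
    acc.gender.push({ k: doc._id.gender, v: doc.total });
    acc.sum += doc.total;
  }
  // Like $arrayToObject: turn the k/v pairs into fields of a nested object g.
  return [...byYear.values()].map(({ _id, gender, sum }) => ({
    _id,
    g: Object.fromEntries(gender.map(({ k, v }) => [k, v])),
    sum,
  }));
}

console.log(reshapeByYear(countsFromFirstGroup));
```

Unlike the other answers, the counts end up nested under g (for example g.male) and the total is named sum, so one more $project would be needed to match the flat shape asked for in the question.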