MongoDB aggregation query based on multiple fields with similar values

I have documents like this:

{
    "_id" : "001",
    "a" : {
        "b" : {
            "c" : {
                "custId" : "cust1"
            },
            "d" : {
                "custId" : "cust2"
            }
        }
    }
}
{
    "_id" : "002",
    "a" : {
        "b" : {
            "c" : {
                "custId" : "cust1"
            },
            "d" : {
                "custId" : "cust3"
            }
        }
    }
}
{
    "_id" : "003",
    "a" : {
        "b" : {
            "c" : {
                "custId" : null
            },
            "d" : {
                "custId" : "cust2"
            }
        }
    }
}
{
    "_id" : "004",
    "a" : {
        "b" : {
            "c" : {
                "custId" : null
            },
            "d" : {
                "custId" : "cust1"
            }
        }
    }
}

I want an aggregation that shows a sorted count of customer IDs, ignoring null customer IDs, like this:

{
    "_id" : "cust1",
    "count" : 3,
    "records" : [ 
        "001", "002", "004"
    ]
}
{
    "_id" : "cust2",
    "count" : 2,
    "records" : [ 
        "001", "003"
    ]
}
{
    "_id" : "cust3",
    "count" : 1,
    "records" : [ 
        "002"
    ]
}

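I think each document needs to be broken down into an array of 1 or 2 customer IDs and then unwound back into documents, but I have not been able to work out a viable solution.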

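  • $objectToArray converts the b object into an array of key/value pairs
  • $map iterates over that array and builds an array of custId values (custIds)
  • $unwind deconstructs the custIds array into one document per element
  • $match keeps only the documents whose custIds value is not null
  • $group groups by custIds, counts the records with $sum, and builds a unique array of _id values with $addToSet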
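db.collection.aggregate([
  {
    // Turn a.b into [{ k: "c", v: {...} }, { k: "d", v: {...} }] and keep each custId
    $project: {
      custIds: {
        $map: {
          input: { $objectToArray: "$a.b" },
          in: "$$this.v.custId"
        }
      }
    }
  },
  // One output document per element of custIds
  { $unwind: "$custIds" },
  // Drop the null customer IDs
  { $match: { custIds: { $ne: null } } },
  {
    // Count occurrences per customer and collect the distinct source _id values
    $group: {
      _id: "$custIds",
      count: { $sum: 1 },
      records: { $addToSet: "$_id" }
    }
  }
])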
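
To sanity-check the logic of the stages above without a running server, here is a minimal plain JavaScript sketch that mimics them in memory on the sample documents from the question. The helper name countByCustId is hypothetical (it is not a MongoDB API). The final sort by count and the sorting of records are assumptions added here to make the output deterministic. The pipeline itself has no sort stage, and $addToSet does not guarantee element order.

```javascript
// Sample documents from the question.
const docs = [
  { _id: "001", a: { b: { c: { custId: "cust1" }, d: { custId: "cust2" } } } },
  { _id: "002", a: { b: { c: { custId: "cust1" }, d: { custId: "cust3" } } } },
  { _id: "003", a: { b: { c: { custId: null }, d: { custId: "cust2" } } } },
  { _id: "004", a: { b: { c: { custId: null }, d: { custId: "cust1" } } } },
];

// Hypothetical helper (not a MongoDB API): emulates the pipeline stages in memory.
function countByCustId(documents) {
  const groups = new Map();
  for (const doc of documents) {
    // $objectToArray + $map: take custId from every sub-object of a.b
    const custIds = Object.values(doc.a.b).map((v) => v.custId);
    // $unwind + $match: visit each custId, skipping null (or undefined) values
    for (const custId of custIds) {
      if (custId === null || custId === undefined) continue;
      // $group: $sum: 1 counts each unwound entry, $addToSet keeps distinct _id values
      if (!groups.has(custId)) groups.set(custId, { count: 0, records: new Set() });
      const group = groups.get(custId);
      group.count += 1;
      group.records.add(doc._id);
    }
  }
  return [...groups.entries()]
    .map(([custId, g]) => ({ _id: custId, count: g.count, records: [...g.records].sort() }))
    // Assumption: order by count descending, then by _id, to match the desired output
    .sort((x, y) => y.count - x.count || x._id.localeCompare(y._id));
}

console.log(JSON.stringify(countByCustId(docs), null, 2));
```

On the server side, the same ordering could probably be obtained by appending a stage such as { $sort: { count: -1 } } after $group. That stage is an addition here, not part of the pipeline shown above.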
