MongoDB Aggregation to get count and Y sample entries

MongoDB version: 4.2.17.

I am trying to run an aggregation over the data in a collection.

Sample data:

 {
        "_id" : "244",
        "pubName" : "p1",
        "serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
        "serviceName" : "my-service",
        "subName" : "c1",
        "pubState" : "INVITED"
 }

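I would like to:

  • Match by something (say subName), group by serviceIdRef, and then limit the result to X entries.
  • Also return, for each serviceIdRef, the count of documents in each of the ACTIVE and INVITED states.
  • Return Y documents (for this example, say Y=3) that are in each of those states.

For example, the output would look like this (in brief):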

[
    {
        serviceIdRef: "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
        serviceName:
        state:[
            {
                pubState: "INVITED"   
                count: 200
                sample: [ // Get those Y entries (here Y=3)
                    {
                        // sample1 like:
                        "_id" : "244",
                        "pubName" : "p1",
                        "serviceIdRef" : "36e9c779-7865-4b74-a30b-e4d6a0cc5295",
                        "serviceName" : "my-service",
                        "subName" : "c1",
                        "pubState" : "INVITED"

                    },
                    {
                        sample2
                    },
                    {
                        sample3
                    }
                ]
            },
            {
                pubState: "ACTIVE", // For this state, repeat as we did for "INVITED" state above.
                ......
            }
        ]
    }
    {
        repeat for another service
    }
]
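
To make the target shape concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB involved). The function name `summarize` and the parameters `x` and `y` are illustrative assumptions, not part of any MongoDB API; the point is only that `count` covers every matching document while `sample` holds at most Y of them.

```javascript
// Plain JavaScript reference for the desired result shape.
// summarize, x, and y are hypothetical names used only for illustration.
function summarize(docs, subNames, states, x, y) {
  const services = new Map(); // serviceIdRef -> { serviceIdRef, serviceName, byState }
  for (const doc of docs) {
    // Equivalent of the match step: keep only requested subNames and states
    if (!subNames.includes(doc.subName) || !states.includes(doc.pubState)) continue;
    if (!services.has(doc.serviceIdRef)) {
      services.set(doc.serviceIdRef, {
        serviceIdRef: doc.serviceIdRef,
        serviceName: doc.serviceName,
        byState: new Map(),
      });
    }
    const byState = services.get(doc.serviceIdRef).byState;
    if (!byState.has(doc.pubState)) {
      byState.set(doc.pubState, { pubState: doc.pubState, count: 0, sample: [] });
    }
    const entry = byState.get(doc.pubState);
    entry.count += 1; // count every matching document
    if (entry.sample.length < y) entry.sample.push(doc); // keep at most y samples
  }
  return [...services.values()]
    .slice(0, x) // at most x services
    .map(({ serviceIdRef, serviceName, byState }) => ({
      serviceIdRef,
      serviceName,
      state: [...byState.values()],
    }));
}
```

Called as `summarize(docs, ["c1", "c2"], ["INVITED", "ACTIVE"], 22, 3)`, this returns one object per service with a `state` array like the one sketched above.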

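So far I have written the pipeline below, but I am not able to get those Y entries. Is there a (better) way?

This is what I have so far (incomplete, and it does not produce output exactly in the format above):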

db.sub.aggregate(
    [{
        $match:
        {
            "subName": {
                $in: ["c1", "c2"]
    
            },
            
            "$or": [
                {
                    "pubState": "INVITED",
                },
                {
                    "pubState": "ACTIVE",
                }
            ]
        }
    },
    {
        $group: {
            _id: "$serviceIdRef",
            subs: {
                $push: "$$ROOT",
    
            }
    
        }
    },
    {
        $sort: {
            _id: -1,
        }
    },
    {
        $limit: 22
    },
    {
       $facet:
        {
            facet1: [
                {
                    $unwind: "$subs",
                },
                {
                    $group:
                    {
                        _id: {
                            "serviceName" : "$_id",
                            "pubState": "$subs.pubState",
                            "subState": "$subs.subsState"
                        },
                        count: {
                            $sum: 1
                        }
                            
                    }
                }
            ]
        }
    }
    
    ])
    

You have to add a second $group stage to build the nested structure:

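  • $match your conditions
  • $sort by _id in descending order
  • $group by serviceIdRef and pubState, take the first value of each required field, push the documents into an array for sample, and get the document count
  • $group by serviceIdRef only and construct the state array
  • $slice to limit the number of documents in sample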
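db.collection.aggregate([
  // Keep only the requested subscribers and states
  {
    $match: {
      subName: { $in: ["c1", "c2"] },
      pubState: { $in: ["INVITED", "ACTIVE"] }
    }
  },
  // Highest _id first
  { $sort: { _id: -1 } },
  // One group per (serviceIdRef, pubState): count the documents and collect them
  {
    $group: {
      _id: {
        serviceIdRef: "$serviceIdRef",
        pubState: "$pubState"
      },
      serviceName: { $first: "$serviceName" },
      sample: { $push: "$$ROOT" },
      count: { $sum: 1 }
    }
  },
  // One document per serviceIdRef, with one state entry per pubState
  {
    $group: {
      _id: "$_id.serviceIdRef",
      serviceName: { $first: "$serviceName" },
      state: {
        $push: {
          pubState: "$_id.pubState",
          count: "$count",
          // Keep only Y sample documents per state (Y = 3 in the question)
          sample: { $slice: ["$sample", 3] }
        }
      }
    }
  }
])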
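
The question also asks to cap the result at X services and to expose the key as serviceIdRef rather than _id. A possible extension of the pipeline above, written as a sketch: the helper name `buildPipeline` and the parameters `x` and `y` are assumptions introduced here for illustration, and the trailing $sort, $limit, and $project stages are additions that the pipeline above does not contain.

```javascript
// Hypothetical helper: builds the aggregation pipeline as a plain array.
// x = maximum number of services, y = sample documents per state.
function buildPipeline(subNames, states, x, y) {
  return [
    { $match: { subName: { $in: subNames }, pubState: { $in: states } } },
    { $sort: { _id: -1 } },
    {
      // One group per (serviceIdRef, pubState)
      $group: {
        _id: { serviceIdRef: "$serviceIdRef", pubState: "$pubState" },
        serviceName: { $first: "$serviceName" },
        sample: { $push: "$$ROOT" },
        count: { $sum: 1 },
      },
    },
    {
      // One document per serviceIdRef with a nested state array
      $group: {
        _id: "$_id.serviceIdRef",
        serviceName: { $first: "$serviceName" },
        state: {
          $push: {
            pubState: "$_id.pubState",
            count: "$count",
            sample: { $slice: ["$sample", y] },
          },
        },
      },
    },
    // Added stages (not in the pipeline above): order services, keep x of them
    { $sort: { _id: -1 } },
    { $limit: x },
    // Rename _id to serviceIdRef to match the requested output shape
    { $project: { _id: 0, serviceIdRef: "$_id", serviceName: 1, state: 1 } },
  ];
}

const pipeline = buildPipeline(["c1", "c2"], ["INVITED", "ACTIVE"], 22, 3);
```

In the mongo shell this would be run as `db.sub.aggregate(pipeline)`, using the collection name from the question. The $limit is placed after the second $group so that it counts services, not individual documents.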

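
One thing to keep in mind with `$push: "$$ROOT"` is that every matching document of a group is collected before `$slice` trims the array, which may get expensive when a state holds many documents. A possible alternative that works on MongoDB 4.2 is to compute only the counts in the first $group and fetch the Y samples with a $lookup sub-pipeline. This is a sketch under assumptions: `buildLookupPipeline` and its parameters are hypothetical names, the collection is assumed to be `sub` as in the question, and whether it is actually faster depends on the data and indexes.

```javascript
// Hypothetical helper: counts per (serviceIdRef, pubState) first, then
// fetches at most y sample documents per group via a $lookup sub-pipeline.
function buildLookupPipeline(collName, subNames, states, x, y) {
  return [
    { $match: { subName: { $in: subNames }, pubState: { $in: states } } },
    {
      // Counts only; no documents are accumulated here
      $group: {
        _id: { serviceIdRef: "$serviceIdRef", pubState: "$pubState" },
        serviceName: { $first: "$serviceName" },
        count: { $sum: 1 },
      },
    },
    {
      // Self-lookup: pull y documents for this service and state
      $lookup: {
        from: collName,
        let: { sid: "$_id.serviceIdRef", st: "$_id.pubState" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ["$serviceIdRef", "$$sid"] },
                  { $eq: ["$pubState", "$$st"] },
                ],
              },
              subName: { $in: subNames },
            },
          },
          { $sort: { _id: -1 } },
          { $limit: y },
        ],
        as: "sample",
      },
    },
    {
      // Nest the per-state results under each service
      $group: {
        _id: "$_id.serviceIdRef",
        serviceName: { $first: "$serviceName" },
        state: {
          $push: { pubState: "$_id.pubState", count: "$count", sample: "$sample" },
        },
      },
    },
    { $sort: { _id: -1 } },
    { $limit: x },
  ];
}

const lookupPipeline = buildLookupPipeline(
  "sub", ["c1", "c2"], ["INVITED", "ACTIVE"], 22, 3
);
```

This would be run as `db.sub.aggregate(lookupPipeline)`. The trade-off is one sub-query per (service, state) pair in exchange for never holding a full group in memory.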