How do I summarize tags by category in MongoDB?

I have a collection shaped like this:

[
    {
        _id: ObjectId("5d8e8c9b8f8b9b7b7a8b4567"),
        tags: {
            language: [ 'en' ],
            industries: [ 'agency', 'travel' ],
            countries: [ 'ca', 'us' ],
            regions: [ 'north-america' ],
        }
    },
    {
        _id: ObjectId("5d8e8c9b8f8b9b7b7a8b4568"),
        tags: {
            language: [ 'en', 'fr' ],
            industries: [ 'travel' ],
            countries: [ 'ca' ]
        }
    },
    {
        _id: ObjectId("5d8e8c9b8f8b9b7b7a8b4569"),
        tags: {
            language: [ 'en' ],
            industries: [ 'agency', 'travel' ],
            countries: [ 'ca', 'us' ],
            regions: [ 'south-america' ]
        }
    },
]

I want to generate this result:

{
    //* count of all documents
    "count": 3,
    //* count of all documents that contain any slug within the given category
    "countWithCategorySlug": {
        "language": 3,
        "industries": 3,
        "countries": 3,
        "regions": 2
    },
    //* per category: count of documents that contain that slug in the given category
    "language": {
        "en": 3,
        "fr": 1
    },
    "industries": {
        "agency": 2,
        "travel": 3
    },
    "countries": {
        "ca": 3,
        "us": 2
    },
    "regions": {
        "north-america": 1,
        "south-america": 1
    }
}

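I'm really stuck, so any help would be greatly appreciated. :)

The number of categories is unknown. I have a code-based solution that queries the list of distinct categories and slugs and then generates a $group stage for each one. The resulting query is too large, so I need a better approach, but I have no idea how to optimize it.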

Query

  • The first part, before the $facet, splits the tags apart and makes one document per value, like
  [{
  "type": "language",
  "value": "en",
  "_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
  "type": "industries",
  "value": "agency",
  "_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
  "type": "industries",
  "value": "travel",
  "_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
  "type": "countries",
  "value": "ca",
  "_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
}]
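  • Then a $facet with 3 fields groups and counts the documents
  • Finally, a transformation reshapes the result so the data is keyed like the expected output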
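
To make the three steps above easier to follow, here is a rough plain-JavaScript sketch of the same idea, run on the sample documents from the question. It is not the pipeline itself, only an in-memory illustration: the function names `flatten` and `summarize` are made up for this example, and the ObjectIds are replaced by short strings.

```javascript
// Sample documents from the question (ObjectIds replaced by plain strings).
const docs = [
  { _id: "a", tags: { language: ["en"], industries: ["agency", "travel"],
                      countries: ["ca", "us"], regions: ["north-america"] } },
  { _id: "b", tags: { language: ["en", "fr"], industries: ["travel"],
                      countries: ["ca"] } },
  { _id: "c", tags: { language: ["en"], industries: ["agency", "travel"],
                      countries: ["ca", "us"], regions: ["south-america"] } },
];

// Step 1: one row per (document, category, slug),
// comparable to $objectToArray followed by the two $unwind stages.
function flatten(documents) {
  return documents.flatMap((doc) =>
    Object.entries(doc.tags ?? {}).flatMap(([type, values]) =>
      values.map((value) => ({ type, value, _id: doc._id }))
    )
  );
}

// Steps 2 and 3: collect distinct _ids per bucket (like $addToSet),
// then replace every set by its size (like $size) and key the result.
function summarize(rows) {
  const all = new Set();
  const perCategory = new Map(); // type -> Set of _id
  const perValue = new Map();    // type -> Map(value -> Set of _id)
  for (const { type, value, _id } of rows) {
    all.add(_id);
    if (!perCategory.has(type)) perCategory.set(type, new Set());
    perCategory.get(type).add(_id);
    if (!perValue.has(type)) perValue.set(type, new Map());
    const byValue = perValue.get(type);
    if (!byValue.has(value)) byValue.set(value, new Set());
    byValue.get(value).add(_id);
  }
  const sizes = (map) =>
    Object.fromEntries([...map].map(([key, ids]) => [key, ids.size]));
  return {
    count: all.size,
    category: sizes(perCategory),
    values: Object.fromEntries(
      [...perValue].map(([type, byValue]) => [type, sizes(byValue)])
    ),
  };
}

const summary = summarize(flatten(docs));
console.log(JSON.stringify(summary, null, 2));
```

Counting distinct `_id` values rather than rows is what keeps a document from being counted twice in a category where it has several slugs.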

Playmongo

aggregate(
[{"$set": {"tags": {"$objectToArray": "$tags"}}},
 {"$set": 
   {"tags": 
     {"$map": 
       {"input": "$tags",
        "in": {"type": "$$this.k", "value": "$$this.v", "_id": "$_id"}}}}},
 {"$unwind": "$tags"},
 {"$replaceRoot": {"newRoot": "$tags"}},
 {"$unwind": "$value"},
 {"$facet": 
   {"count": 
     [{"$group": {"_id": null, "count": {"$addToSet": "$_id"}}},
       {"$set": {"count": {"$size": "$count"}}}],
    "category": 
     [{"$group": {"_id": "$type", "count": {"$addToSet": "$_id"}}},
       {"$set": {"count": {"$size": "$count"}}}],
    "values": 
     [{"$group": 
         {"_id": {"type": "$type", "value": "$value"},
          "values": {"$addToSet": "$_id"}}},
       {"$set": {"values": {"$size": "$values"}}},
       {"$group": 
         {"_id": "$_id.type",
          "values": 
           {"$push": 
             {"type": "$_id.type", "value": "$_id.value", "count": "$values"}}}}]}},
 {"$set": 
   {"count": 
     {"$getField": 
       {"field": "count", "input": {"$arrayElemAt": ["$count", 0]}}},
    "category": 
     {"$arrayToObject": 
       [{"$map": 
           {"input": "$category",
            "in": {"k": "$$this._id", "v": "$$this.count"}}}]},
    "values": 
     {"$arrayToObject": 
       [{"$map": 
           {"input": "$values",
            "in": 
             {"k": "$$this._id",
              "v": 
               {"$arrayToObject": 
                 [{"$map": 
                     {"input": "$$this.values",
                      "in": {"k": "$$this.value", "v": "$$this.count"}}}]}}}}]}}}])

Output

[{
  "count": 3,
  "category": {
    "countries": 3,
    "industries": 3,
    "regions": 2,
    "language": 3
  },
  "values": {
    "regions": {
      "south-america": 1,
      "north-america": 1
    },
    "countries": {
      "us": 2,
      "ca": 3
    },
    "language": {
      "fr": 1,
      "en": 3
    },
    "industries": {
      "agency": 2,
      "travel": 3
    }
  }
}]
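
The output above nests the per-category counts under `values` and names the category totals `category`, while the question asked for the categories at the top level and for the key `countWithCategorySlug`. One possible way to close that gap is to reshape the single result document on the client. The sketch below assumes the result has exactly the shape shown above; `toRequestedShape` is a hypothetical helper name, not part of any driver.

```javascript
// Hypothetical helper: hoists each category under `values` to the top level
// and renames `category` to `countWithCategorySlug`, as in the question.
function toRequestedShape(result) {
  const { count, category, values } = result;
  return { count, countWithCategorySlug: category, ...values };
}

// The single document returned by the pipeline (copied from the output above).
const pipelineResult = {
  count: 3,
  category: { countries: 3, industries: 3, regions: 2, language: 3 },
  values: {
    regions: { "south-america": 1, "north-america": 1 },
    countries: { us: 2, ca: 3 },
    language: { fr: 1, en: 3 },
    industries: { agency: 2, travel: 3 },
  },
};

const reshaped = toRequestedShape(pipelineResult);
console.log(JSON.stringify(reshaped, null, 2));
```

Note that a category literally named `count` or `countWithCategorySlug` would overwrite the summary keys in this flat shape, which is one reason to keep the nested `values` form. The same reshaping could probably also be done server-side with a final `$replaceRoot` stage that uses `$mergeObjects` to combine the two summary fields with `$values`, though that variant is not shown or tested here.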