Creating a structure using an aggregation query that groups by 2 ids

I have a collection of documents similar to the three objects shown below.

{
   comment: {
      text_sentiment: "positive",
      topic: "A"
   }
}, // DOC-1

{
   comment: {
      text_sentiment: "negative",
      topic: "A"
   }
}, // DOC-2

{
   comment: {
      text_sentiment: "positive",
      topic: "B"
   }
}, // DOC-3
...

I want to write an aggregation that returns the following structure:

{
   topic: "A",
   topicOccurance: 2,
   sentiment: {
      positive: 3,
      negative: 2,
      neutral: 0
   }

},...

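I have written an aggregation that groups by topic and text_sentiment, but I don't know how to create a structure like the one shown above. This is the aggregation I created.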

   db.MyCollection.aggregate({
       $match: {
           _id: "xyz",
           "comment.topic": {$exists: 1},
       }
   },{
       $group: {
           _id: {
               topic: "$comment.topic",
               text_sentiment: "$comment.text_sentiment"
               
           },
           total: {$sum: 1},
           
       }
   },{
       $project: {
           topic: {
               name: "$_id.topic",
               occurence: "$total"
           },
           sentiment: "$_id.text_sentiment"
       }
   },{
       $sort: {"topic.occurence": -1}
   })

It groups by topic and sentiment, but the structure does not match the one above. How can I get a similar structure?

Answer 1

You need two $group stages.

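  1. $match
  2. $group - Group by comment.topic and comment.text_sentiment, and count the documents with $sum.
  3. $group - Group by _id.topic and aggregate with $sum; add text_sentiment and total from the previous stage to text_sentiments via $push.
  4. $project - Decorate the output document. Set sentiment by converting the text_sentiments array to key-value pairs via $arrayToObject.
  5. $sort
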
db.collection.aggregate([
  {
    $match: {
      _id: "xyz",
      "comment.topic": {
        $exists: 1
      },
      
    }
  },
  {
    $group: {
      _id: {
        topic: "$comment.topic",
        text_sentiment: "$comment.text_sentiment"
      },
      total: {
        $sum: 1
      },
      
    }
  },
  {
    $group: {
      _id: "$_id.topic",
      // Sum the per-sentiment counts so that total is the number of documents for the topic
      total: {
        $sum: "$total"
      },
      text_sentiments: {
        $push: {
          k: "$_id.text_sentiment",
          v: "$total"
        }
      }
    }
  },
  {
    $project: {
      topic: "$_id",
      topicOccurance: "$total",
      sentiment: {
        "$arrayToObject": "$text_sentiments"
      }
    }
  },
  {
    $sort: {
      "topicOccurance": -1
    }
  }
])
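
To make the data flow through the stages easier to follow, here is a plain JavaScript sketch (no MongoDB required) that mimics the two $group stages, $arrayToObject and $sort on the three sample documents from the question. The function names are hypothetical and only illustrate the shape of the intermediate documents; the sketch sums the per-sentiment counts to get the topic total, which is an assumption about what topicOccurance is meant to be.

```javascript
// Plain JavaScript emulation of the pipeline stages (illustrative only).
// The three sample documents from the question.
const docs = [
  { comment: { text_sentiment: "positive", topic: "A" } },
  { comment: { text_sentiment: "negative", topic: "A" } },
  { comment: { text_sentiment: "positive", topic: "B" } },
];

// First $group: one entry per (topic, text_sentiment) pair with its count.
function groupByTopicAndSentiment(documents) {
  const counts = new Map();
  for (const { comment } of documents) {
    const key = JSON.stringify([comment.topic, comment.text_sentiment]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, total]) => {
    const [topic, text_sentiment] = JSON.parse(key);
    return { _id: { topic, text_sentiment }, total };
  });
}

// Second $group: one entry per topic, collecting { k, v } pairs like $push.
function groupByTopic(pairs) {
  const topics = new Map();
  for (const { _id, total } of pairs) {
    const entry =
      topics.get(_id.topic) || { _id: _id.topic, total: 0, text_sentiments: [] };
    entry.total += total; // assumption: topic total = sum of per-sentiment counts
    entry.text_sentiments.push({ k: _id.text_sentiment, v: total });
    topics.set(_id.topic, entry);
  }
  return [...topics.values()];
}

// $project + $sort: turn the { k, v } array into an object, like $arrayToObject,
// then order by topicOccurance descending.
function shapeOutput(groups) {
  return groups
    .map(({ _id, total, text_sentiments }) => ({
      topic: _id,
      topicOccurance: total,
      sentiment: Object.fromEntries(text_sentiments.map(({ k, v }) => [k, v])),
    }))
    .sort((a, b) => b.topicOccurance - a.topicOccurance);
}

const result = shapeOutput(groupByTopic(groupByTopicAndSentiment(docs)));
console.log(JSON.stringify(result, null, 2));
```

Note that a sentiment with no matching documents (for example neutral for topic "A") simply does not appear as a key with this approach, because only the pairs that actually occur are pushed.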

Sample Mongo Playground (Answer 1)


Answer 2

Since the text_sentiment values are fixed, as mentioned, you can use the following query:

db.collection.aggregate([
  {
    $match: {
      _id: "xyz",
      "comment.topic": {
        $exists: 1
      },
      
    }
  },
  {
    $group: {
      _id: "$comment.topic",
      total: {
        $sum: 1
      },
      text_sentiments: {
        $push: "$comment.text_sentiment"
      }
    }
  },
  {
    $project: {
      topic: "$_id",
      topicOccurance: "$total",
      sentiment: {
        "positive": {
          $reduce: {
            input: "$text_sentiments",
            initialValue: 0,
            in: {
              $sum: [
                "$$value",
                {
                  "$cond": {
                    "if": {
                      $eq: [
                        "$$this",
                        "positive"
                      ]
                    },
                    "then": 1,
                    "else": 0
                  }
                }
              ]
            }
          }
        },
        "negative": {
          $reduce: {
            input: "$text_sentiments",
            initialValue: 0,
            in: {
              $sum: [
                "$$value",
                {
                  "$cond": {
                    "if": {
                      $eq: [
                        "$$this",
                        "negative"
                      ]
                    },
                    "then": 1,
                    "else": 0
                  }
                }
              ]
            }
          }
        },
        "neutral": {
          $reduce: {
            input: "$text_sentiments",
            initialValue: 0,
            in: {
              $sum: [
                "$$value",
                {
                  "$cond": {
                    "if": {
                      $eq: [
                        "$$this",
                        "neutral"
                      ]
                    },
                    "then": 1,
                    "else": 0
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  {
    $sort: {
      "topicOccurance": -1
    }
  }
])

Drawback: when a text_sentiment value is added or removed, the query needs to be modified.
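
One possible way to soften this drawback is to build the sentiment part of the $project stage in application code from a list of values, so that only the list changes. The sketch below is an illustration, not part of the original answer: SENTIMENTS, countSentiment and projectStage are hypothetical names, and the generated expression is the same $reduce/$cond pattern used in the query above.

```javascript
// Hypothetical list of allowed values; edit this array instead of the pipeline.
const SENTIMENTS = ["positive", "negative", "neutral"];

// Builds the $reduce/$cond counting expression for a single sentiment value.
function countSentiment(value) {
  return {
    $reduce: {
      input: "$text_sentiments",
      initialValue: 0,
      in: {
        $sum: [
          "$$value",
          { $cond: { if: { $eq: ["$$this", value] }, then: 1, else: 0 } },
        ],
      },
    },
  };
}

// One counting expression per sentiment value, keyed by the value itself.
const sentiment = Object.fromEntries(
  SENTIMENTS.map((s) => [s, countSentiment(s)])
);

// The $project stage that could stand in for the hand-written one.
const projectStage = {
  $project: { topic: "$_id", topicOccurance: "$total", sentiment },
};

console.log(JSON.stringify(projectStage, null, 2));
```

The resulting projectStage object can be placed in the pipeline array where the literal $project stage currently sits.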

Sample Mongo Playground (Answer 2)


Answer 3

Another approach, similar to Answer 2, is to use $size and $filter in place of $reduce.

db.collection.aggregate([
  {
    $match: {
      //_id: "xyz",
      "comment.topic": {
        $exists: 1
      },
      
    }
  },
  {
    $group: {
      _id: "$comment.topic",
      total: {
        $sum: 1
      },
      text_sentiments: {
        $push: "$comment.text_sentiment"
      }
    }
  },
  {
    $project: {
      topic: "$_id",
      topicOccurance: "$total",
      sentiment: {
        "positive": {
          $size: {
            $filter: {
              input: "$text_sentiments",
              cond: {
                $eq: [
                  "$$this",
                  "positive"
                ]
              }
            }
          }
        },
        "negative": {
          $size: {
            $filter: {
              input: "$text_sentiments",
              cond: {
                $eq: [
                  "$$this",
                  "negative"
                ]
              }
            }
          }
        },
        "neutral": {
          $size: {
            $filter: {
              input: "$text_sentiments",
              cond: {
                $eq: [
                  "$$this",
                  "neutral"
                ]
              }
            }
          }
        },
        
      }
    }
  },
  {
    $sort: {
      "topicOccurance": -1
    }
  }
])
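
As a rough analogy in plain JavaScript (no MongoDB involved), the two counting strategies from Answer 2 and Answer 3 can be compared side by side. The function names are hypothetical, and the input array is what $push would collect for topic "A" from the sample documents in the question.

```javascript
// Array that $push would build for topic "A" from the sample documents.
const textSentiments = ["positive", "negative"];

// Counterpart of $reduce + $cond: add 1 for each match, starting from 0.
function countWithReduce(values, target) {
  return values.reduce((acc, current) => acc + (current === target ? 1 : 0), 0);
}

// Counterpart of $size + $filter: keep the matches, then take the length.
function countWithFilter(values, target) {
  return values.filter((current) => current === target).length;
}

// Both strategies yield the same count, including 0 for a missing value.
for (const s of ["positive", "negative", "neutral"]) {
  console.log(s, countWithReduce(textSentiments, s), countWithFilter(textSentiments, s));
}
```

Unlike the $arrayToObject approach in Answer 1, both variants report an explicit 0 for a sentiment that does not occur, which matches the neutral: 0 entry in the desired structure.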

Sample Mongo Playground (Answer 3)