Aggregation for counting the occurrences of a substring in a main string in MongoDB

Question

I'm new to MongoDB, so this may be a newbie question.

I want to count how many times "lupoK" is repeated in the message field (for example, "message" : "first lupoK lupoK") using aggregation in MongoDB. I am using the Studio 3T interface.

My document structure is:

{ 
    "_id" : ObjectId("5df9c780b05196da93be262b"), 
    "id" : "61a4c53a-aa99-4336-ab4f-07bb7f618889", 
    "time" : "00:00:45", 
    "username" : "siul", 
    "message" : "***first lupoK lupoK***", 
    "emoticon_place" : [
        {
            "_id" : "128428", 
            "begin" : NumberInt(6), 
            "end" : NumberInt(10)
        }
    ], 
    "fragments" : [
        {
            "text" : "first "
        }, 
        {
            "emoticon" : {
                "emoticon_id" : "128428", 
                "emoticon_set_id" : ""
            }, 
            "text" : "***lupoK***"
        },
        {
            "emoticon" : {
                "emoticon_id" : "128428", 
                "emoticon_set_id" : ""
            }, 
            "text" : "***lupoK***"
        }
    ]
}

Thanks in advance!

Answer 1

This works in the mongo shell (assuming the message field exists and is a string):

db.test.aggregate( [
  { 
      $project: { 
          _id: 0, 
          message: 1, 
          count: { 
              $subtract: [ 
                  { $size: { $split: [ "$message", "lupoK" ] } }, 1 
              ] 
          } 
      } 
  }
] )
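
If some documents might lack the message field or store a non-string value in it, the pipeline above can fail, because $split expects a string and $size expects an array. A minimal sketch of one possible guard, assuming such documents should simply report a count of 0; the variable name guardedPipeline is hypothetical and not part of the original answer:

```javascript
// Hypothetical variant: count "lupoK" only when message is a string,
// otherwise report 0 (an assumption about the desired behavior).
const guardedPipeline = [
  {
    $project: {
      _id: 0,
      message: 1,
      count: {
        $cond: [
          // $type returns "string" for string values and "missing" for absent fields
          { $eq: [ { $type: "$message" }, "string" ] },
          { $subtract: [ { $size: { $split: [ "$message", "lupoK" ] } }, 1 ] },
          0
        ]
      }
    }
  }
];

// In the mongo shell (collection name "test" taken from the answer above):
// db.test.aggregate( guardedPipeline )
```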

Notes:

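The $split operator splits the message string on a delimiter, in this case "lupoK". The split returns an array of the tokens separated by "lupoK". So the number of tokens minus 1 gives the number of times "lupoK" is used, that is, the number of occurrences of "lupoK".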
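
The "tokens minus one" idea can be tried outside the database. The following is only a sketch in plain JavaScript, assuming String.prototype.split behaves like $split for a plain, non-empty string delimiter; countOccurrences is a hypothetical helper name, not a MongoDB API:

```javascript
// Plain-JavaScript mirror of the $split / $size / $subtract expression.
// Assumes needle is a non-empty string.
function countOccurrences(message, needle) {
  // Splitting on the needle yields one more token than there are occurrences.
  const tokens = message.split(needle);
  return tokens.length - 1;
}

// The message from the question splits into three tokens, so the count is 2.
console.log(JSON.stringify("first lupoK lupoK".split("lupoK"))); // ["first "," ",""]
console.log(countOccurrences("first lupoK lupoK", "lupoK"));     // 2
```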

Check the results with these sample message strings:

"***first lupoK lupoK***"
"lupoKlupoK"
" lupoK lupoK "
""
"lupoKlupoKlupoK"
"lupoK"
"HELLO * lupoK* WORLD"
"HELLO WORLD"
"***first lupoK lupoKlupoK lupoK***lupoK *** last lupoK."

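As a rough cross-check of the sample strings above, the same arithmetic can be run over them in plain JavaScript. This is a sketch rather than the aggregation itself, and it assumes JavaScript's split matches $split for this delimiter:

```javascript
// Sample strings from the answer, in the order listed above.
const samples = [
  "***first lupoK lupoK***",
  "lupoKlupoK",
  " lupoK lupoK ",
  "",
  "lupoKlupoKlupoK",
  "lupoK",
  "HELLO * lupoK* WORLD",
  "HELLO WORLD",
  "***first lupoK lupoKlupoK lupoK***lupoK *** last lupoK."
];

// Same arithmetic as the pipeline: number of tokens minus one.
const counts = samples.map((s) => s.split("lupoK").length - 1);
console.log(counts.join(", ")); // 2, 2, 2, 0, 3, 1, 1, 0, 6
```
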
For example, the tokens for some of the strings:

"***first lupoK lupoK***" produces these three tokens: [ "***first ", " ", "***" ]
"HELLO * lupoK* WORLD" has these two tokens: [ "HELLO * ", "* WORLD" ]
"***first lupoK lupoKlupoK lupoK***lupoK *** last lupoK." has seven tokens: [ "***first ", " ", "", " ", "***", " *** last ", "." ]
