Compute the most common array elements

Question

I have a bunch of documents that each contain an array of tags:

{ tags: ["tag1", "tag2", "tag3"] }

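What I want to do is compute the 10 most frequently used tags across all documents. After some trial and error, I came up with the following solution: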

r.db("database").table("table").concatMap(function(doc) {
  return doc("tags")
}).coerceTo("array").group(function(entry) {
  return entry
}).count().ungroup().orderBy(r.desc("reduction")).limit(10).map(function(doc) {
  return doc("group")
})

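However, I "feel" (my knowledge of query optimization is limited) that this is a rather cumbersome approach. Can anyone suggest a more efficient way that makes proper use of indexes?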

Answer 1

Apart from the coerceTo('array'), which I don't think is necessary and which may hurt performance, I think the query is fine. You can also shorten it a lot:

r.table('table').group('tags', {multi: true}).count().ungroup().orderBy('reduction').slice(-10)('group')
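To make the steps of that one-liner easier to follow, here is a rough plain-JavaScript sketch of the same computation over an in-memory array. It does not use the RethinkDB driver, and the names topTags and docs are made up for illustration; it only mirrors my understanding of what group with {multi: true}, count, ungroup, orderBy and slice(-10) do in sequence.

```javascript
// Plain JavaScript sketch (no RethinkDB involved).
// `topTags` and `docs` are hypothetical names used only for illustration.
function topTags(docs, n) {
  const counts = new Map();
  for (const doc of docs) {
    // Roughly what group('tags', {multi: true}).count() does:
    // each element of the tags array contributes to its own group.
    for (const tag of doc.tags) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  // Same shape as the output of ungroup(): [{group, reduction}, ...]
  const rows = Array.from(counts, ([group, reduction]) => ({ group, reduction }));
  // Ascending sort by count, then keep the last n rows,
  // like orderBy('reduction').slice(-n)
  rows.sort((a, b) => a.reduction - b.reduction);
  // Like ('group'): keep only the tag itself
  return rows.slice(-n).map((row) => row.group);
}

const docs = [
  { tags: ["tag1", "tag2", "tag3"] },
  { tags: ["tag1", "tag2"] },
  { tags: ["tag1"] },
];
console.log(topTags(docs, 2)); // [ 'tag2', 'tag1' ]
```

Note that, because the sort is ascending and the last entries are taken, the most common tag ends up last in the result rather than first.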

rethinkdb