How can I flatten double arrays in MongoDB?

Question

Some fields in my MongoDB documents look like this:

{
...
Countries: [["Spain", "France"]]
...
}

Or like this:

{
...
Countries: [["Spain"],["Russia", "Egypt"]]
...
}

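What I want to do is turn [["Spain", "France"]] into ["Spain", "France"] and [["Spain"],["Russia", "Egypt"]] into ["Spain", "Russia", "Egypt"], similar to using the flatten method in Ruby.

Is there a way to flatten arrays in MongoDB? I need to flatten the arrays in every document of the collection, not just in a single document. In case it matters, the values in the arrays and the number of values also vary from document to document.

I am also using Ruby as the driver for MongoDB, so an approach that uses the Ruby driver would be useful to me as well.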

Answer 1

Try this:

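db.test2.aggregate([
   // Unwind twice: once for the outer array, once for each inner array.
   {"$unwind" : "$Countries"},
   {"$unwind" : "$Countries"},
   // $addToSet drops duplicate names and does not guarantee element order.
   {$group : { _id : '$_id', Countries: { $addToSet: "$Countries" }}}
])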

Answer 2

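You need to run an aggregation with two $unwind stages and a single $group stage. The rule of thumb is to unwind as many times as the arrays are nested. Here the nesting depth is 2, so we unwind twice.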

 collection.aggregate([
   {"$unwind" => "$Countries"},
   {"$unwind" => "$Countries"},
   {"$group" => {"_id" => "$_id", "Countries" => {"$push" => "$Countries"}}}
 ])

The first $unwind stage produces:

{
        "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
        "Countries" : [
                "Spain",
                "France"
        ]
}
{
        "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
        "Countries" : [
                "Spain"
        ]
}
{
        "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
        "Countries" : [
                "Russia",
                "Egypt"
        ]
}

The second $unwind stage flattens the Countries array further:

{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "France" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Russia" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Egypt" }

Finally, the $group stage groups the records by _id and accumulates the country names into a single array:

{
        "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
        "Countries" : [
                "Spain",
                "Russia",
                "Egypt"
        ]
}
{
        "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
        "Countries" : [
                "Spain",
                "France"
        ]
}
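To make the three stages easier to follow, here is a minimal sketch in plain JavaScript that mimics them on the two sample documents. The helpers unwind and groupPush are illustrative stand-ins, not MongoDB APIs, and the numeric _id values are made up for brevity. The sketch assumes every document has a Countries array nested exactly two levels deep.

```javascript
// Illustrative stand-in for $unwind: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap(doc =>
    doc[field].map(item => ({ ...doc, [field]: item }))
  );
}

// Illustrative stand-in for $group with $push: collect values per _id.
function groupPush(docs, field) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc._id)) groups.set(doc._id, []);
    groups.get(doc._id).push(doc[field]);
  }
  return [...groups].map(([_id, values]) => ({ _id, [field]: values }));
}

const docs = [
  { _id: 1, Countries: [["Spain", "France"]] },
  { _id: 2, Countries: [["Spain"], ["Russia", "Egypt"]] },
];

const once = unwind(docs, "Countries");   // field now holds an inner array
const twice = unwind(once, "Countries");  // field now holds a single name
const flattened = groupPush(twice, "Countries");

console.log(JSON.stringify(flattened));
```

The real $group stage does not promise any particular order of the output documents, which is why the sample output above lists the documents in a different order than the input.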

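If you want to keep the other fields of the document, you need to name every field other than Countries explicitly (field1, field2, and so on) using the $first operator. You can write to, or overwrite, a collection by giving its name in the $out stage.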

 collection.aggregate([
   {"$unwind" => "$Countries"},
   {"$unwind" => "$Countries"},
   {"$group" => {"_id" => "$_id", "Countries" => {"$push" => "$Countries"},
                 "field1" => {"$first" => "$field1"}}},
   {"$out" => "collection"}
 ])

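You need to specify the fields explicitly so that you do not end up with a redundant Countries field.

You can use the $$ROOT system variable to store the whole document, but that makes the Countries field redundant: one copy sits at the top level and another inside doc.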

 collection.aggregate([
   {"$unwind" => "$Countries"},
   {"$unwind" => "$Countries"},
   {"$group" => {"_id" => "$_id", "Countries" => {"$push" => "$Countries"},
                 "doc" => {"$first" => "$$ROOT"}}},
   {"$out" => "collection"}
 ])
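One possible way to avoid the duplicated Countries field is to promote the saved document back to the top level after grouping. The following is only a sketch in mongo shell syntax, and it assumes MongoDB 3.6 or later, where $replaceRoot and $mergeObjects are both available; "collection" is the same placeholder name used above.

```javascript
// Assumes MongoDB 3.6+ ($replaceRoot with $mergeObjects).
const pipeline = [
  { $unwind: "$Countries" },
  { $unwind: "$Countries" },
  { $group: {
      _id: "$_id",
      Countries: { $push: "$Countries" },
      doc: { $first: "$$ROOT" }
  } },
  // Make the saved document the new root and overwrite its Countries value
  // (a single unwound element at this point) with the flattened array.
  { $replaceRoot: {
      newRoot: { $mergeObjects: ["$doc", { Countries: "$Countries" }] }
  } }
];

// In the mongo shell: db.collection.aggregate(pipeline)
```

Keep in mind that $unwind drops documents whose Countries field is missing or is an empty array, so if an $out stage is appended to overwrite the source collection, those documents would no longer be in it.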

Answer 3

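Your Countries data is not in a good format, so you might consider converting it. Here is a script, to be run in the mongo shell, that flattens the arrays in the Countries field and saves the result back to the original documents: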

function flattenArray(inArr) {
    var ret = [];
    inArr.forEach(function(arr) {
        if (Array.isArray(arr)) {
           ret = ret.concat(flattenArray(arr));
        } else {
           ret.push(arr);                   
        }
    });
    return ret;
}


db.collection.find({
  'Countries': {
    '$exists': true
  }
}).forEach(function(doc){
  doc.Countries = flattenArray(doc.Countries);
  db.collection.save(doc);
});

Answer 4

In MongoDB 3.4+, you can use $reduce to flatten a two-dimensional array.

db.collection.aggregate(
  [
    {
      $project: {
        "countries": {
          $reduce: {
            input: '$Countries',
            initialValue: [],
            in: {$concatArrays: ['$$value', '$$this']}
          }
        }
      }
    }
  ]
)

Docs: https://docs.mongodb.com/manual/reference/operator/aggregation/reduce/
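As a rough illustration of what the $reduce expression computes, here is a plain JavaScript analogue. The function name flattenOneLevel is made up for this sketch; value plays the role of $$value and current the role of $$this. The commented-out shell command at the end is an assumption about how the same expression could be written back in place, and it relies on MongoDB 4.2 or later, where updateMany accepts an aggregation pipeline.

```javascript
// Plain JavaScript analogue of $reduce with $concatArrays:
// start from an empty array and append each inner array in turn.
function flattenOneLevel(nested) {
  return nested.reduce((value, current) => value.concat(current), []);
}

console.log(flattenOneLevel([["Spain", "France"]]));
console.log(flattenOneLevel([["Spain"], ["Russia", "Egypt"]]));

// Assumption: MongoDB 4.2+ (pipeline-style updates). Run in the mongo shell:
// db.collection.updateMany(
//   { Countries: { $exists: true } },
//   [ { $set: { Countries: { $reduce: {
//       input: "$Countries",
//       initialValue: [],
//       in: { $concatArrays: ["$$value", "$$this"] }
//   } } } } ]
// )
```

Unlike JavaScript's concat, $concatArrays expects each argument to be an array, so this approach assumes every element of Countries is itself an array. It removes exactly one level of nesting.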
