'join' 一个键上的两个 MongoDB 集合,没有创建嵌套结构

'join' two MongoDB collections on a key, without creating a nested structure

我想 'join' 一个键上的两个 MongoDB 集合,而不创建嵌套结构。

代码:

# Init client
client_mongo = pymongo.MongoClient(
    host="mongo", port=27017, username="user", password="pass"
)

# Use db gen
db = client_mongo.gen

# Create collections
db.create_collection("wikipedia")
db.create_collection("twitter")

# Add sample record to collection twitter
db.twitter.insert_one({'_id': "1527275059001274368",
 'name': 'Nike',
 'username_twitter': 'nikestore',
 'user_id_twitter': 17351972,
 'text_twitter': '@sillspb Hey, Paul. What error message are you seeing on your end?'})

# Add sample record to collection wikipedia
db.wikipedia.insert_one({'_id': '6286417074ef92666893148a',
 'name': 'Nike',
 'page_wikipedia': 'Nike,_Inc.',
 'text_wikipedia': 'Nike, Inc. ( or ) is an American multinational corporation that is engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services.'})

我想(做等同于)加入键“name”并创建一个名为“twitter_wikipedia”的新集合,没有嵌套结构。

具有嵌套结构的当前解决方案:

db.twitter.aggregate(
    [
        {
            "$lookup": {
                "from": "wikipedia",  # other table name
                "localField": "name",  # key field in collection 2
                "foreignField": "name",  # key field in collection 1
                "as": "linked_collections",  # alias for resulting table
            }
            
        },
        {
        '$out': 'twitter_wikipedia'
    }
    ])

当前输出

db.twitter_wikipedia.find_one({})
{'_id': '1527275059001274368',
 'name': 'Nike',
 'username_twitter': 'nikestore',
 'user_id_twitter': 17351972,
 'text_twitter': '@sillspb Hey, Paul. What error message are you seeing on your end?',
 'linked_collections': [{'_id': '6286417074ef92666893148a',
   'name': 'Nike',
   'page_wikipedia': 'Nike,_Inc.',
   'text_wikipedia': 'Nike, Inc. ( or ) is an American multinational corporation that is engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services.'}]}

# Desired output
{'name': 'Nike',
 'username_twitter': 'nikestore',
 'user_id_twitter': 17351972,
 'text_twitter': '@sillspb Hey, Paul. What error message are you seeing on your end?',
 'page_wikipedia': 'Nike,_Inc.',
 'text_wikipedia': 'Nike, Inc. ( or ) is an American multinational corporation that is engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services.'}

在完成 $lookup

之后,只需执行简单的 $unwind$project
db.twitter.aggregate([
  {
    "$lookup": {
      "from": "wikipedia",
      "localField": "name",
      "foreignField": "name",
      "as": "linked_collections"
    }
  },
  {
    "$unwind": "$linked_collections"
  },
  {
    "$project": {
      "name": 1,
      "username_twitter": 1,
      "user_id_twitter": 1,
      "text_twitter": 1,
      "page_wikipedia": "$linked_collections.page_wikipedia",
      "text_wikipedia": "$linked_collections.text_wikipedia"
    }
  },
  {
    $out: "twitter_wikipedia"
  }
])

这里是Mongo Playground供您参考。