MongoDB aggregation $graphLookup - find "connections" in common in a collection of following relationships

I have a "who is following whom" collection (like Instagram):

db.users.insertMany([
  { _id: 1, name: "Arnold Schwarzenegger" },
  { _id: 2, name: "James Earl Jones" },
  { _id: 3, name: "Harrison Ford" },
  { _id: 4, name: "Jennifer Lawrence" }
]);

db.follows.insertMany([
  { _id: 12, follower: 1, following: 2 },
  { _id: 13, follower: 1, following: 3 },
  { _id: 24, follower: 2, following: 4 },
  { _id: 23, follower: 2, following: 3 }
]);

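I'm trying to suggest to each user other users they could follow, i.e. a list of suggested accounts, ordered by the number of existing connections in common.

In this example: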

+--------+--------------+----------+
|   A    | is following |    B     |
+--------+--------------+----------+
| Arnold | ->           | James    |
| Arnold | ->           | Harrison |
| James  | ->           | Jennifer |
| James  | ->           | Harrison |
+--------+--------------+----------+

Given Arnold and James, who could Arnold follow next? (Excluding the people he already follows.)

The answer should be: Jennifer
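
To pin down the rule I have in mind, here is a minimal plain-JavaScript sketch over the same sample data (no MongoDB involved). The helper name `suggestFollows` and the `commonConnections` field are made up for illustration, and I'm assuming "connections in common" means the number of accounts the user already follows who in turn follow the candidate.

```javascript
// Sample data copied from the collections above.
const users = [
  { _id: 1, name: 'Arnold Schwarzenegger' },
  { _id: 2, name: 'James Earl Jones' },
  { _id: 3, name: 'Harrison Ford' },
  { _id: 4, name: 'Jennifer Lawrence' }
];
const follows = [
  { _id: 12, follower: 1, following: 2 },
  { _id: 13, follower: 1, following: 3 },
  { _id: 24, follower: 2, following: 4 },
  { _id: 23, follower: 2, following: 3 }
];

// Hypothetical helper (not a MongoDB API): suggests accounts for userId,
// ranked by how many already-followed accounts follow each candidate.
function suggestFollows(userId, follows) {
  // Everyone the user follows today.
  const alreadyFollowing = new Set(
    follows.filter(f => f.follower === userId).map(f => f.following)
  );
  const counts = new Map();
  for (const edge of follows) {
    // Only edges that start at someone the user follows are "second hop".
    const viaFollowed = alreadyFollowing.has(edge.follower);
    // Skip the user themselves and anyone they already follow.
    const isNew = edge.following !== userId && !alreadyFollowing.has(edge.following);
    if (viaFollowed && isNew) {
      counts.set(edge.following, (counts.get(edge.following) || 0) + 1);
    }
  }
  // Highest count first; candidate id breaks ties.
  return [...counts.entries()]
    .map(([candidate, commonConnections]) => ({ candidate, commonConnections }))
    .sort((a, b) => b.commonConnections - a.commonConnections || a.candidate - b.candidate);
}

const suggestions = suggestFollows(1, follows);
const names = suggestions.map(s => users.find(u => u._id === s.candidate).name);
console.log(names); // [ 'Jennifer Lawrence' ]
```

What I'm after is an aggregation pipeline that produces the same result on the server.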

Here's a poor attempt:

db.users.aggregate([
  {
    $match: { _id: 1 } // Arnold
  },
  {
    $graphLookup: {
      from: "follows",
      startWith: "$_id",
      connectFromField: "following",
      connectToField: "follower",
      maxDepth: 1,
      as: "connections",
    }
  }
]);

This results in:

  {
    "_id": 1,
    "name": "Arnold Schwarzenegger",
    "connections": [
      {
        "_id": 24,
        "follower": 2,
        "following": 4
      },
      {
        "_id": 13,
        "follower": 1,
        "following": 3
      },
      {
        "_id": 23,
        "follower": 2,
        "following": 3
      },
      {
        "_id": 12,
        "follower": 1,
        "following": 2
      }
    ]
  }

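I think I need to do some $unwind'ing, but I'm kind of stuck now.

Here are two possible approaches. (I haven't tested with larger data sets, so your mileage may vary!)

The first builds on your $graphLookup stage: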

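db.users.aggregate([
  { $match: { _id: 1 }}, // Arnold
  // Collect Arnold's own follow edges (depth 0) and the edges of the people he follows (depth 1).
  { $graphLookup: {
    from: 'follows',
    startWith: '$_id',
    connectFromField: 'following',
    connectToField: 'follower',
    maxDepth: 1,
    as: 'connections'
  }},
  { $unwind: { path: '$connections' }},
  // One document per follower, listing who that follower follows.
  { $group: {
    _id: '$connections.follower',
    follows: {
      $addToSet: '$connections.following'
    }
  }},
  { $unwind: { path: '$follows' }},
  // Invert: one document per followed user, listing who follows them.
  { $group: {
    _id: '$follows',
    isFollowedBy: {
      $addToSet: '$_id'
    }
  }},
  // Drop anyone Arnold (_id 1) already follows; keep this id in sync with the first $match.
  { $match: { isFollowedBy: { $not: { $in: [1] }} }},
  // Gather the remaining user ids into a single result document.
  { $group: {
    _id: null,
    newConnections: {
      $addToSet: '$_id'
    }
  }},
  { $project: { _id: 0 }}
])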

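Note that this pipeline ends up building the relationship through the follows collection partway through, so an alternative is to start from the follows collection directly, like this: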

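db.follows.aggregate([
  // Self-join: for each edge A -> B, fetch every edge B -> C.
  { $lookup: {
    from: 'follows',
    localField: 'following',
    foreignField: 'follower',
    as: 'potentialSet'
  }},
  // Keep edges whose followed user follows nobody, so alreadyFollowing stays complete.
  { $unwind: {
    path: '$potentialSet',
    preserveNullAndEmptyArrays: true
  }},
  // Per follower: who they follow now, and who those users follow.
  { $group: {
    _id: '$follower',
    alreadyFollowing: {
      $addToSet: '$following'
    },
    potentialConnections: {
      $addToSet: '$potentialSet.following'
    }
  }},
  // Remove existing connections, and the user themselves in case someone follows them back.
  { $project: {
    newConnections: { $setDifference: [
      '$potentialConnections',
      { $setUnion: [ '$alreadyFollowing', [ '$_id' ] ] }
    ] }
  }},
  // Keep only Arnold. On a large collection, starting the pipeline with
  // { $match: { follower: 1 } } instead avoids processing every user.
  { $match: { _id: 1 }},
  { $project: { _id: 0 }}
])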
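
Neither pipeline orders the suggestions by the number of common connections, which the question also asks for. Here is a rough sketch of one way to add that, wrapped in a helper so the user id isn't hard-coded. `buildSuggestionPipeline`, `secondHop` and `commonConnections` are names I made up. It assumes MongoDB 3.6 or later (for `$expr` in `$match`) and that each follower/following pair appears only once in `follows`. I haven't run it against larger data sets.

```javascript
// Hypothetical helper (not part of MongoDB): returns the stages to pass to
// db.follows.aggregate(...) for the given user id.
function buildSuggestionPipeline(userId) {
  return [
    // 1. Edges where the user is the follower.
    { $match: { follower: userId } },
    // 2. Collapse them into one document listing everyone already followed.
    { $group: {
      _id: '$follower',
      alreadyFollowing: { $addToSet: '$following' }
    }},
    // 3. Second hop: edges whose follower is someone the user follows.
    //    $lookup matches each element of the array localField.
    { $lookup: {
      from: 'follows',
      localField: 'alreadyFollowing',
      foreignField: 'follower',
      as: 'secondHop'
    }},
    { $unwind: '$secondHop' },
    // 4. Drop candidates the user already follows, and the user themselves.
    { $match: { $expr: { $and: [
      { $not: [{ $in: ['$secondHop.following', '$alreadyFollowing'] }] },
      { $ne: ['$secondHop.following', '$_id'] }
    ]}}},
    // 5. One document per candidate, counting the edges that lead to them.
    { $group: {
      _id: '$secondHop.following',
      commonConnections: { $sum: 1 }
    }},
    // 6. Most common connections first; _id breaks ties deterministically.
    { $sort: { commonConnections: -1, _id: 1 } }
  ];
}

// Usage in mongosh (assumes the `follows` collection from the question):
// db.follows.aggregate(buildSuggestionPipeline(1))
```

Filtering on `follower` in the first stage keeps the work proportional to one user's edges, and an index on `follows.follower` would likely help both the `$match` and the `$lookup`.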

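If it helps, I used MongoDB Compass Community Edition to build these pipelines. It's pretty cool in that it lets you iterate quickly and see the output of each stage, which is very useful when you're trying to debug a pipeline.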