Access root document map in the $filter aggregation (MongoDB)

Question

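I apologize for the vague title, but I have a fairly complex question about filtering in a MongoDB aggregation. Please take a look at my data schema to understand the problem better: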

Company {
  _id: ObjectId
  name: string
}

License {
  _id: ObjectId
  companyId: ObjectId
  userId: ObjectId
}

User {
  _id: ObjectId
  companyId: ObjectId
  email: string
}

Goal:

I want to query all unlicensed users of a company. Doing that takes these simple MongoDB queries:

const licenses = db.licenses.find({ companyId }).toArray(); // Get all licenses for a specific company
const userIds = licenses.map(l => l.userId); // Collect all licensed user ids

// Query all users of that company that don't hold a license
const nonLicensedUsers = db.users.find({ companyId, _id: { $nin: userIds } });

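Problem:

The code above works fine. However, in our system a company may have hundreds of thousands of users, so the first and the last step become extremely expensive. Let me elaborate. First, a large number of license documents has to be fetched from the database and sent over the network, which is costly. Then a huge $nin query has to be sent back over the network to MongoDB, which doubles the overhead.

So I want to perform all of the operations above on the MongoDB side and return only a small slice of the unlicensed users, to avoid the network transfer costs. Any ideas on how to achieve this?

I was able to get quite close using the following aggregation (pseudocode):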

db.company.aggregate([
  { $match: { _id: id } }, // Step 1. Find the company entity by id
  { $lookup: {...} }, // Step 2. Joins 'users' collection by `companyId` field
  { $lookup: {...} }, // Step 3. Joins 'licenses' collection by `companyId` field
  { 
    $project: {
      licensesMap: // Step 4. Convert 'licenses' array to the map with the shape { 'user-id': true }. Could be done with $arrayToObject operator
    }
  },
  {
    $project: {
      unlicensedUsers: {
        $filter: {...} // And this is the place, where I stopped
      }
    }
  }
]);

Let's take a closer look at the last stage of the aggregation above. I tried to use the $filter aggregation in the following way:

{
  $filter: {
    input: "$users",
    as: "user",
    cond: {
      $ne: ["$licensesMap[$$user._id]", true]
    }
  }
}

Unfortunately, this did not work. It seems that MongoDB does not apply any interpolation and simply tries to compare the raw "$licensesMap[$$user._id]" string with the boolean true.
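For context, a field path such as "$licensesMap[...]" is read as a literal path; there is no bracket syntax for indexing an object by a variable. One possible way around the map entirely is to test membership with the `$in` aggregation operator against the `userId` values of the joined licenses. The sketch below assumes that the two `$lookup` stages stored their results in fields named `users` and `licenses` (the question elides those details), and the constant name is only illustrative.

```javascript
// Hypothetical final stage: keep users whose _id is not among the licensed ids.
// Assumes earlier $lookup stages produced the arrays `users` and `licenses`.
const unlicensedUsersStage = {
  $project: {
    name: 1,
    unlicensedUsers: {
      $filter: {
        input: "$users",
        as: "user",
        // "$licenses.userId" resolves to the array of userId values
        cond: { $not: [{ $in: ["$$user._id", "$licenses.userId"] }] },
      },
    },
  },
};
```

Because ids are compared directly, no conversion to strings should be needed here. Note that joining hundreds of thousands of users and licenses into a single company document may run into the 16 MB BSON document limit, so this shape is probably better suited to smaller companies.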

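Note #1:

Unfortunately, we cannot change the current data schema. That would be too costly for us.

Note #2:

I did not include it in the aggregation example above, but I do convert the Mongo ObjectIds to strings in order to build licensesMap. Likewise, I stringify the IDs in the users list so that licensesMap can be accessed correctly.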

Sample data:

Companies collection:

[
  { _id: "1", name: "Acme" }
]

Licenses collection:

[
  { _id: "1", companyId: "1", userId: "1" },
  { _id: "2", companyId: "1", userId: "2" }
]

Users collection:

[
  { _id: "1", companyId: "1" },
  { _id: "2", companyId: "1" },
  { _id: "3", companyId: "1" },
  { _id: "4", companyId: "1" },
]

Expected result:

[
  {
    _id: "1", // company id
    name: "Acme",
    unlicensedUsers: [
      { _id: "3", companyId: "1" },
      { _id: "4", companyId: "1" },
    ]
  }
]

Explanation: the unlicensedUsers list contains the third and fourth users because they have no corresponding entries in the licenses collection.
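To make the expected result concrete, the same logic can be sketched in plain JavaScript over the sample data. This is only an in-memory illustration of the semantics (an anti-join of users against licenses), not something to run against a large data set; the variable names are illustrative.

```javascript
// Sample data copied from the question.
const licenses = [
  { _id: "1", companyId: "1", userId: "1" },
  { _id: "2", companyId: "1", userId: "2" },
];
const users = [
  { _id: "1", companyId: "1" },
  { _id: "2", companyId: "1" },
  { _id: "3", companyId: "1" },
  { _id: "4", companyId: "1" },
];

// Collect licensed user ids in a Set so each membership test is a cheap lookup.
const licensedIds = new Set(licenses.map((l) => l.userId));

// Keep only the users that have no license entry.
const unlicensedUsers = users.filter((u) => !licensedIds.has(u._id));

console.log(unlicensedUsers.map((u) => u._id)); // → ["3", "4"]
```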

Answer 1

How about something simple like this:

db.usersCollection.aggregate([
  {
    $lookup: {
      from: "licensesCollection",
      localField: "_id",
      foreignField: "userId",
      as: "licensedUsers"
    }
  },
  {$match: {"licensedUsers.0": {$exists: false}}},
  {
    $group: {
      _id: "$companyId",
      unlicensedUsers: {$push: {_id: "$_id", companyId: "$companyId"}}
    }
  },
  {
    $lookup: {
      from: "companiesCollection",
      localField: "_id",
      foreignField: "_id",
      as: "company"
    }
  },
  {$project: {unlicensedUsers: 1, company: {$arrayElemAt: ["$company", 0]}}},
  {$project: {unlicensedUsers: 1, name: "$company.name"}}
])

playground example

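Between them, the users collection and the licenses collection have everything you need. After the first $lookup "merges" them, a simple $match keeps only the unlicensed users, and the rest is just formatting the output into the shape you requested.

Bonus: this solution works with any type of ID. For example: playground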
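If only one company is of interest, as in the question, a variation could filter users by `companyId` before the join so that the `$lookup` runs only for that company's users. The sketch below is one way to write that; the helper name `buildUnlicensedUsersPipeline` is hypothetical, and the collection names `users` and `licenses` are taken from the shell snippet in the question rather than from this answer.

```javascript
// Hypothetical helper: builds a pipeline to run on the `users` collection.
function buildUnlicensedUsersPipeline(companyId) {
  return [
    // Narrow down to one company's users before joining.
    { $match: { companyId } },
    // Attach each user's license documents (empty array if there are none).
    {
      $lookup: {
        from: "licenses",
        localField: "_id",
        foreignField: "userId",
        as: "licenses",
      },
    },
    // Keep only users without any license.
    { $match: { licenses: { $size: 0 } } },
    // Drop the helper field from the output.
    { $project: { licenses: 0 } },
  ];
}

// Usage in the shell (not executed here):
//   db.users.aggregate(buildUnlicensedUsersPipeline(companyId))
// An index on the joined field is likely to help the $lookup:
//   db.licenses.createIndex({ userId: 1 })
```

The result is a flat stream of user documents, so pagination stages such as `$skip` and `$limit` can be appended to return only a small slice at a time.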

Answer 2

If you run into a similar situation, keep in mind that the above solution only works with a hashed index.

filtering

mongodb

mongodb-query

aggregation-framework