MongoDB aggregation: Counting results of the lookup without joining

I am working on this query:

customers.aggregate: [
  {$lookup: {
    from: "users",
    localField: "_id",
    foreignField: "customerId",
    as: "users"
  }},
  {$lookup: {
    from: "templates",
    let: {localField: "$_id"},
    pipeline: [{
      $match: {$and: [
        {$expr: {$eq: ["$customerId", "$$localField"]}},
        {module: false}
      ]}
    }],
    as: "templates"
  }},
  {$lookup: {
    from: "publications",
    localField: "_id",
    foreignField: "customerId",
    as: "publications"
  }},
  {$lookup: {
    from: "documents",
    let: {localField: "$_id"},
    pipeline: [{
      $match: {$and: [
        {$expr: {$eq: ["$customerId", "$$localField"]}},
        {createdAt: {$gte: {$date: "<someDate>"}}}
      ]}
    }],
    as: "recentDocuments"
  }}
]

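In the last lookup stage I filter the documents whose customerId field matches the customer's _id and whose createdAt is on or after <someDate>, and then join those documents to the respective "customer" object. After this step, or if possible even in the same step, I also want to add a new field to each resulting "customer" document containing the count of all documents in the "documents" collection (not only those that pass the time filter) whose customerId value corresponds to the customer document's _id. I also do not want these documents joined to the customer object, because I only need the total number of documents for the corresponding customerId. I can only use the Extended JSON v1 strict mode syntax. The result would look like this: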

customers: [
 0: {
  users: [...],
  templates: [...],
  publications: [...],
  recentDocuments: [...],
  totalDocuments: <theCountedNumber>
 },
 1: {...},
 2: {...},
 ...
] 

Use $set with $size:

db.customers.aggregate([
  {
    $lookup: {
      from: "documents",
      let: { localField: "$_id" },
      pipeline: [
        {
          $match: {
            $and: [
              { $expr: { $eq: [ "$customerId", "$$localField" ] } }
            ]
          }
        }
      ],
      as: "recentDocuments"
    }
  },
  {
    $set: {
      totalDocuments: { $size: "$recentDocuments" }
    }
  }
])

mongoplayground
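
Note that the lookup above has no date filter, so the array it stores under recentDocuments holds every document of the customer, not just the recent ones. If both the filtered array and the total count are needed, one possible variation is to look everything up once, count it, and then filter it. This is only a sketch: allDocuments is a made-up temporary field name, "<someDate>" is the placeholder from the question, and it assumes your client converts the strict-mode {$date: ...} value into a date before the pipeline is sent. It also still pulls every matching document into memory for each customer, so it may not suit large collections.

```javascript
// Sketch only: "allDocuments" is a hypothetical temporary field name, and
// "<someDate>" is the placeholder from the question, not a real date.
const pipeline = [
  {
    $lookup: {
      from: "documents",
      localField: "_id",
      foreignField: "customerId",
      as: "allDocuments"
    }
  },
  {
    $set: {
      // Count every matching document, regardless of createdAt.
      totalDocuments: { $size: "$allDocuments" },
      // Keep only the documents created on or after the cutoff.
      recentDocuments: {
        $filter: {
          input: "$allDocuments",
          as: "doc",
          cond: { $gte: ["$$doc.createdAt", { $date: "<someDate>" }] }
        }
      }
    }
  },
  // Drop the temporary array so it is not returned with the customer.
  { $unset: "allDocuments" }
];
```

The stages would be passed to db.customers.aggregate(pipeline), after the other lookups from the question if those are needed too.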

So on Thursday I found the correct syntax to solve my problem. It goes like this:

db.customers.aggregate([
  {$lookup: {
    from: "users",
    localField: "_id",
    foreignField: "customerId",
    as: "users"
  }},
  {$lookup: {
    from: "templates",
    let: {localField: "$_id"},
    pipeline: [{
      $match: {$and: [
        {$expr: {$eq: ["$customerId", "$$localField"]}},
        {module: false}
      ]}
    }],
    as: "templates"
  }},
  {$lookup: {
    from: "publications",
    localField: "_id",
    foreignField: "customerId",
    as: "publications"
  }},
  // Documents created on or after <someDate>, joined to the customer.
  {$lookup: {
    from: "documents",
    let: {localField: "$_id"},
    pipeline: [{
      $match: {$and: [
        {$expr: {$eq: ["$customerId", "$$localField"]}},
        {createdAt: {$gte: {$date: "<someDate>"}}}
      ]}
    }],
    as: "recentDocuments"
  }},
  // All documents of the customer, reduced to a single count by $count.
  {$lookup: {
    from: "documents",
    let: {localField: "$_id"},
    pipeline: [
      {$match: {$expr: {$eq: ["$customerId", "$$localField"]}}},
      {$count: "count"}
    ],
    as: "documentsNumber"
  }}
])

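In the last stage of the aggregation pipeline, this command goes through the documents collection again, but this time it matches all of the customer's documents instead of filtering by time period. The $count stage inside the lookup then replaces the matched documents with a single result object, so each "customer" object gets an array with one item holding the number of all its documents. That array could later be flattened with the $unwind operation, but doing so turned out to reduce performance significantly, so I omitted it. I really hope this helps someone solve a similar problem.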
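
If a plain number is preferred over the one-element documentsNumber array, it can probably be pulled out without $unwind by appending two more stages. This is a minimal sketch, not something I have benchmarked against $unwind. The field name totalDocuments comes from the question, and the fallback to 0 is there because $count emits nothing when no documents match, which leaves documentsNumber as an empty array.

```javascript
// Sketch only: stages to append after the last $lookup that produces
// "documentsNumber". $addFields is used here rather than $set only because
// it is also available on older server versions.
const flattenCount = [
  {
    $addFields: {
      // "$documentsNumber.count" resolves to [N], or to [] when $count
      // matched nothing, so fall back to 0 if there is no first element.
      totalDocuments: {
        $ifNull: [{ $arrayElemAt: ["$documentsNumber.count", 0] }, 0]
      }
    }
  },
  // Remove the one-element helper array from the output.
  { $project: { documentsNumber: 0 } }
];
```

With these stages appended, each customer would carry totalDocuments as a number next to recentDocuments, and documentsNumber would no longer appear in the result.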