How to find the number of objects a user has on a particular date?

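I'm learning mongo and I'm trying to compute two metrics for a given user over a given time range. To be precise, I need to compute an array of objects like this, each representing the backpack state on a particular date: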

{ 
  data: [
    { date: '2020-01-01', itemsCount: 1, itemsSize: 5 },
    { date: '2020-01-02', itemsCount: 3, itemsSize: 12 },
    ...

  ]
} 

where itemsCount is the total number of items the user has, and itemsSize is the sum of the sizes of all those items.

I have a mongodb collection with four types of events, structured as follows:

{
  type: "backpack.created",   // type of event
  backpackId: 1,
  timestamp: 1604311699,      // timestamp in seconds when event occurred
  ownerId: 1,
  size: 15,                   // sum of sizes of all items located in the backpack
  itemsCount: 5               // number of items in the backpack                    
}
{
  type: "backpack.owner.changed",    
  timestamp: 1604311699, 
  newOwnerId: 2,
  backpackId: 1,                    
}
{
  type: "backpack.deleted",
  backpackId: 1,
  timestamp: 1604311699,               
}
{
  type: "backpack.updated",
  backpackId: 1,
  size: 5,
  itemsCount: 25,
  timestamp: 1604311699,                             
}

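My first idea was to load all events for the given user and time range into memory and do the calculation there, but that sounds terrible for memory usage. So I wonder how I can write a query that gives me these metrics. Is it even possible with mongo? I don't know how to deal with the ownership changes.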

Note: a backpack that is created and deleted on the same day contributes 0 for that day.

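I am not convinced that what you want to do, i.e. build a cross-backpack position by day, can be served entirely by a mongodb pipeline. The reason is that you need to track state day over day, so that when, say, a backpack.deleted event occurs 3 days from now, you know how much to remove from the running aggregate position.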

That said, mongodb can help you in two ways:

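  1. As a master filter of in-range events that excludes owner.changed, which does not affect position.
  2. As a convenient "last event of the day" generator. Since an update carries new total levels, not increments, the last update of the day is the new position; if the last event is a delete, the position for that backpack becomes zero.

var sdate = new ISODate("2020-11-01");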
var edate = new ISODate("2020-12-01");

c=db.foo.aggregate([
    // Convert timestamp into something more filterable:                            
    {$addFields: {D: {$toDate: {$multiply:[1000,"$timestamp"]} } }}

    // Use DB to do what it does best: filter!                                      
    ,{$match: {type: {$ne: 'backpack.owner.changed'},
               D: {$gte: sdate, $lt: edate}
              }}

    // Ensure material is coming out date DESCENDING (most recent first)            
    // to properly set up for the $group/$first to follow:                          
    ,{$sort: {D:-1}}

    // Since the timestamps include hours/mins/seconds and we only
    // care about day, just turn it into string.  In mongodb 5.0,
    // you should use $dateTrunc to set H:M:S to 00:00:00.
    ,{$group: {_id: {
        D: {$dateToString: {format: '%Y-%m-%d', date:'$D'}},
        B: '$backpackId'
    }
           // Thanks to the $sort above, regardless of the $group set           
           // ordering of date + backpackId, taking the $first is the           
           // last one for that particular day:                                 
               , Lsize: {$first: '$size'}
               , LitemsCount: {$first: '$itemsCount'}
           , Laction: {$first: '$type'}
              }}

    // Now, group *again* to reorganize the content by date alone.                  
    // This makes it easy for the client to pick up a cursor of                     
    // dates which is the intent of the day-to-day position                         
    // building:                                                                    
    ,{$group: {_id: '$_id.D',
               X: {$push: {B:'$_id.B'
                           , Lsize: '$Lsize'
                           , LitemsCount: '$LitemsCount'
                           , Laction: '$Laction'}
                  }
              }}

    // ...and of course sort by date so the client can easily                       
    // walk forward on the cursor by date:                                          
    ,{$sort: {'_id':1}}
]);

At this point you end up with something like this (my test data has more events than the ones in the OP):

{
    "_id" : "2020-11-02",
    "X" : [
        {
            "B" : 3,
            "Lsize" : 3,
            "LitemsCount" : 35,
            "Laction" : "backpack.created"
        },
        {
            "B" : 2,
            "Lsize" : 13,
            "LitemsCount" : 9,
            "Laction" : "backpack.created"
        },
        {
            "B" : 1,
            "Lsize" : 8,
            "LitemsCount" : 28,
            "Laction" : "backpack.updated"
        }
    ]
}
{
    "_id" : "2020-11-03",
    "X" : [
        {
            "B" : 2,
            "Lsize" : 7,
            "LitemsCount" : 11,
            "Laction" : "backpack.updated"
        }
    ]
}
{
    "_id" : "2020-11-04",
    "X" : [
        {
            "B" : 1,
            "Lsize" : null,
            "LitemsCount" : null,
            "Laction" : "backpack.deleted"
        }
    ]
}
{
    "_id" : "2020-11-05",
    "X" : [
        {
            "B" : 3,
            "Lsize" : null,
            "LitemsCount" : null,
            "Laction" : "backpack.deleted"
        }
    ]
}
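
As a side note on the $dateTrunc comment in the pipeline: on MongoDB 5.0 or later, the first $group stage could probably be written along these lines. This is only a sketch; the variable name groupByDayAndBackpack is made up for illustration, and the field names are the ones used in the pipeline above. Note that _id.D would then be a Date at midnight (UTC by default) rather than a 'YYYY-MM-DD' string.

```javascript
// Sketch of the first $group stage using $dateTrunc (MongoDB 5.0+).
// groupByDayAndBackpack is a hypothetical name; D, B, Lsize,
// LitemsCount and Laction follow the pipeline shown earlier.
const groupByDayAndBackpack = {
  $group: {
    _id: {
      // Truncate the timestamp to the start of its day instead of
      // formatting it as a string.
      D: { $dateTrunc: { date: '$D', unit: 'day' } },
      B: '$backpackId'
    },
    // As before, these rely on the preceding {$sort: {D: -1}} so that
    // $first picks the last event of the day.
    Lsize: { $first: '$size' },
    LitemsCount: { $first: '$itemsCount' },
    Laction: { $first: '$type' }
  }
};
```
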

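It is left as an exercise to the reader to walk this cursor and, for each date + backpackId, accumulate a running total of size and itemsCount by backpackId. Any time a deleted event is hit, that backpack's total goes to zero for that day. To get size and itemsCount across all backpacks, simply sum the totals for a given date. Moving the aggregation logic out of MongoDB also makes it easier to represent dates for which there is no material, e.g.: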

    { date: '2020-01-01', itemsCount: 1, itemsSize: 5 },
    { date: '2020-01-02', itemsCount: 0, itemsSize: 0 },
    { date: '2020-01-03', itemsCount: 0, itemsSize: 0 },
    { date: '2020-01-04', itemsCount: 6, itemsSize: 21},
    ...
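
One possible way to do that client-side walk is sketched below. It is a rough illustration, not a tested production solution: the function name dailyTotals and its parameters are made up, it assumes the pipeline output has been read into an array (e.g. with c.toArray()), it assumes days are UTC days as produced by $dateToString, and it does not seed backpacks that already existed before the start of the range. The sample data is the output shown above.

```javascript
// Hypothetical helper: walks the per-day documents from the pipeline and
// carries each backpack's last known position forward from day to day.
function dailyTotals(days, startDate, endDate) {
  // Index the pipeline output by its 'YYYY-MM-DD' _id.
  const byDate = new Map(days.map(d => [d._id, d.X]));
  // Running position per backpackId: { size, itemsCount }.
  const positions = new Map();
  const out = [];
  const DAY_MS = 24 * 60 * 60 * 1000;

  // Step one UTC day at a time over [startDate, endDate), so that days
  // without any events still get an entry.
  for (let t = Date.parse(startDate); t < Date.parse(endDate); t += DAY_MS) {
    const date = new Date(t).toISOString().slice(0, 10);
    for (const e of byDate.get(date) || []) {
      if (e.Laction === 'backpack.deleted') {
        // Last event of the day was a delete: the backpack contributes zero.
        positions.delete(e.B);
      } else {
        // created/updated carry new totals, not increments.
        positions.set(e.B, { size: e.Lsize, itemsCount: e.LitemsCount });
      }
    }
    let itemsCount = 0;
    let itemsSize = 0;
    for (const p of positions.values()) {
      itemsCount += p.itemsCount;
      itemsSize += p.size;
    }
    out.push({ date, itemsCount, itemsSize });
  }
  return out;
}

// Usage with the sample output shown earlier:
const sample = [
  { _id: '2020-11-02', X: [
    { B: 3, Lsize: 3, LitemsCount: 35, Laction: 'backpack.created' },
    { B: 2, Lsize: 13, LitemsCount: 9, Laction: 'backpack.created' },
    { B: 1, Lsize: 8, LitemsCount: 28, Laction: 'backpack.updated' } ] },
  { _id: '2020-11-03', X: [
    { B: 2, Lsize: 7, LitemsCount: 11, Laction: 'backpack.updated' } ] },
  { _id: '2020-11-04', X: [
    { B: 1, Lsize: null, LitemsCount: null, Laction: 'backpack.deleted' } ] },
  { _id: '2020-11-05', X: [
    { B: 3, Lsize: null, LitemsCount: null, Laction: 'backpack.deleted' } ] }
];
const totals = dailyTotals(sample, '2020-11-02', '2020-11-07');
console.log(totals);
```

Because the state lives in a plain Map keyed by backpackId, a backpack that is created and deleted on the same day never enters the totals, which matches the note in the question.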