MongoDB 聚合大型文档

MongoDB Aggregate on huge documents

我正在尝试自学 mongoDb 大 mongoDB(每个文档大约 10Mb,总共 1000 个文档)

我想尝试一些基础知识。例如,列出所有用户完成的每个 Activity,按 UsedCallories 对其进行排序。

db.getCollection('users').aggregate([
  {$group: {_id:"$Activities"}}, 
  {$sort: { UsedCallories: -1}}
],{allowDiskUse:true});

不幸的是,当我执行这个脚本时,它给我:'Script executed successfully, but there are no results to show.'?

你能指出我哪里错了吗?

缩短的示例文件:

{
  "Id": 1,
  "FirstName": "Casie",
  "LastName": "Crapo",
  "Email": "Casie.Crapo@databanken.db",
  "Weight": 92,
  "Length": 198,
  "Activities": [
    {
      "ActivityType": {
        "Name": "Sexual Activity",
        "CallPerSecond": 0.033333333
      },
      "StartCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "EndCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "StartDateTime": { $date: "2016-11-01T23:39:15Z" },
      "EndDateTime": { $date: "2016-11-02T02:38:45Z" },
      "UsedCallories": 772.63042705630426,
      "Measurements": [
        {
          "Heartrate": 142,
          "UnderPressure": 123,
          "Overressure": 156,
          "Speed": 0,
          "Coordinates": {
            "Lattidude": -10.81907,
            "Longitude": -16.16832
          }
        }
      ]
    }
  ]
}

更新'ExpectedOutput':

所以预期的输出只是用户所有数组字段中所有活动的列表。按 UsedCallories 排序。

"Activities": [
    {
      "ActivityType": {
        "Name": "Sexual Activity",
        "CallPerSecond": 0.033333333
      },
      "StartCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "EndCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "StartDateTime": { $date: "2016-11-01T23:39:15Z" },
      "EndDateTime": { $date: "2016-11-02T02:38:45Z" },
      "UsedCallories": 772.63042705630426,
      "Measurements": [
        ...
      ]
    },{
      "ActivityType": {
        "Name": "Sexual Activity",
        "CallPerSecond": 0.033333333
      },
      "StartCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "EndCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "StartDateTime": { $date: "2016-11-01T23:39:15Z" },
      "EndDateTime": { $date: "2016-11-02T02:38:45Z" },
      "UsedCallories": 52.63042705630426,
      "Measurements": [
        ...
      ]
    },{
      "ActivityType": {
        "Name": "Sexual Activity",
        "CallPerSecond": 0.033333333
      },
      "StartCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "EndCoordinates": {
        "Lattidude": -10.81907,
        "Longitude": -16.16832
      },
      "StartDateTime": { $date: "2016-11-01T23:39:15Z" },
      "EndDateTime": { $date: "2016-11-02T02:38:45Z" },
      "UsedCallories": 20.22442,
      "Measurements": [
        ...
      ]
    }
  ]

重复问题后更新

好的,感谢您引用副本 post。这不是同一个问题。

我设法使用其中的一些实际得到了一些结果。 查询更改为:

db.getCollection('users').aggregate([
    {$unwind: '$Activities'}, 
    {$sort: {'Activities.UsedCallories': -1}}, 
    {$group: {_id: '$_id', 'Activities': {$push: '$Activities'}}}
    ], {
  allowDiskUse:true
 })

现在 returns 所有活动都按用户分组,我希望只列出所有这些未按用户分组的活动

谢谢@chridam。添加我的评论作为答案。

db.getCollection('users').aggregate([{
    $unwind: "$Activities"
}, {
    $sort: {
        "Activities.UsedCallories": -1
    }
}, {
    $group: {
        _id: null,
        Activities: {
            $push: "$Activities"
        }
    }
}, {
    $project: {
        _id: 0,
        Activities: 1
    }
}], {
    allowDiskUse: true
});