MongoDB: how to group objects in an array and find the top 3
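I want to find the top 3 locations where a user has posted the most reviews (if there are any).
More specifically, I have a collection review that contains user_id and business_id, and a collection business that contains business_id along with longitude and latitude. If a user has reviewed from the locations (38.551126, -110.880452) and (38.999999, -110.000000), we can say that the user has 2 reviews at the location (38, -110).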
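To make the bucketing concrete, here is a small sketch in plain JavaScript. The `bucket` helper is hypothetical (it is not part of MongoDB), and it assumes truncation toward zero, which is the rounding `$toInt` applies to a double.

```javascript
// Hypothetical helper: drop the fractional part of each coordinate
// (truncation toward zero), so nearby points share one bucket.
function bucket(latitude, longitude) {
  return {
    latitude: Math.trunc(latitude),
    longitude: Math.trunc(longitude),
  };
}

const a = bucket(38.551126, -110.880452);
const b = bucket(38.999999, -110.0);

// Both reviews land in the same bucket: { latitude: 38, longitude: -110 }
console.log(a, b);
```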
collection.review:
{
  "review_id": "KU_O5udG6zpxOg-VcAEodg",
  "user_id": "mh_-eMZ6K5RLWhZyISBhwA",
  "business_id": "XQfwVwDr-v0ZS3_CbbE5Xw",
  "stars": 3,
  "useful": 0,
  "funny": 0,
  "cool": 0,
  "text": "If you decide to eat here, ...",
  "date": "2018-07-07 22:09:11"
}
collection.business:
{
  "business_id": "XQfwVwDr-v0ZS3_CbbE5Xw",
  "name": "Turning Point of North Wales",
  "address": "1460 Bethlehem Pike",
  "city": "North Wales",
  "state": "PA",
  "postal_code": "19454",
  "latitude": 40.2101961875,
  "longitude": -75.2236385919,
  "stars": 3,
  "review_count": 169,
  "is_open": 1,
  "attributes": Object,
  "categories": "Restaurants, Breakfast & Brunch, Food",
  "hours": Object
}
I am a beginner with MongoDB, and the only pipeline that has given me any results is this one:
review.aggregate([{
  $match: {
    user_id: {
      $in: [some_list]
    }
  }
}, {
  $lookup: {
    from: 'business',
    localField: 'business_id',
    foreignField: 'business_id',
    as: 'business'
  }
}, {
  $unwind: {
    path: '$business'
  }
}, {
  $group: {
    _id: '$user_id',
    coordinates: {
      $push: {
        latitude: {
          $toInt: '$business.latitude'
        },
        longitude: {
          $toInt: '$business.longitude'
        }
      }
    },
    places: {
      $sum: 1
    }
  }
}])
Output:
_id: "xoZvMJPDW6Q9pDAXI0e_Ww"
coordinates: Array
  0: Object
    latitude: 39
    longitude: -119
  1: Object
    latitude: 39
    longitude: -119
  2: Object
    latitude: 39
    longitude: -119
  3: Object
    latitude: 39
    longitude: -119
  4: Object
    latitude: 39
    longitude: -119
places: 5
Afterwards I process the results in Python. Is there a way to do it all directly in the pipeline and get a result like this (limited to the top 3)?
_id: "xoZvMJPDW6Q9pDAXI0e_Ww"
top_places: Array
  0: Object
    latitude: 39
    longitude: -119
  1: Object
    latitude: 23
    longitude: 56
You just need to add an extra group stage: first group by location and user, then group by user, like this:
db.review.aggregate([
  {
    $lookup: {
      from: "business",
      localField: "business_id",
      foreignField: "business_id",
      as: "business"
    }
  },
  {
    $unwind: {
      path: "$business"
    }
  },
  {
    // First group: one document per (user, truncated latitude, truncated longitude)
    $group: {
      _id: {
        user: "$user_id",
        latitude: {
          $toInt: "$business.latitude"
        },
        longitude: {
          $toInt: "$business.longitude"
        }
      },
      places: {
        $sum: 1
      }
    }
  },
  {
    // Second group: collect each user's distinct locations and total the review counts
    $group: {
      _id: "$_id.user",
      coordinates: {
        $push: {
          latitude: "$_id.latitude",
          longitude: "$_id.longitude",
        }
      },
      places: {
        $sum: "$places"
      }
    }
  }
])
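The pipeline above collects every distinct location per user but does not order or cap them. One possible way to keep only the three most-reviewed locations is sketched below, as an illustration rather than the definitive approach. The names `userIds`, `pipeline`, `reviews` and `top_places` are assumptions chosen for this example, and the user id is just the one from the sample output.

```javascript
// Placeholder list of users to analyse (assumption: replace with your own ids).
const userIds = ["xoZvMJPDW6Q9pDAXI0e_Ww"];

const pipeline = [
  // Restrict to the users of interest before the join.
  { $match: { user_id: { $in: userIds } } },
  {
    $lookup: {
      from: "business",
      localField: "business_id",
      foreignField: "business_id",
      as: "business",
    },
  },
  { $unwind: "$business" },
  {
    // One document per (user, truncated latitude, truncated longitude).
    $group: {
      _id: {
        user: "$user_id",
        latitude: { $toInt: "$business.latitude" },
        longitude: { $toInt: "$business.longitude" },
      },
      reviews: { $sum: 1 },
    },
  },
  // Most-reviewed locations first.
  { $sort: { reviews: -1 } },
  {
    // Collect each user's locations in the sorted order.
    $group: {
      _id: "$_id.user",
      top_places: {
        $push: {
          latitude: "$_id.latitude",
          longitude: "$_id.longitude",
          reviews: "$reviews",
        },
      },
    },
  },
  // Keep at most the first three entries of the array.
  { $project: { top_places: { $slice: ["$top_places", 3] } } },
];
```

In mongosh this would be run as `db.review.aggregate(pipeline)`. The sketch relies on `$push` receiving documents in the order produced by `$sort`, which is a commonly used pattern; on MongoDB 5.2 or later, the `$topN` accumulator may be a more explicit alternative. The `reviews` field is extra compared with the desired output and can be dropped from the `$push` if only the coordinates are needed.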