How to get aggregate value by filtering inside of a document by a given parameter
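I have a collection named useragents that looks like the dataset below.
I have a use case where I need the sum of the values for each user agent. In this example the user agents are the Linux and Ubuntu operating systems, but the set can be dynamic. As a first step, I found a solution that uses the aggregation framework to get the aggregated sum for each user agent.
Please refer to this.
But I want to sum the values by checking each venueList, ssidList and macList against a given parameter list. This is a very difficult problem for me, because my data structure can sometimes be complex.
I want to get the sum for each user agent (linux, ubuntu) given the following parameters: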
parameterlist 1
venueid : [VID001, VID002] // this is a compulsory field in the parameter list
ssids : [SSID001] // this is an optional field in the parameter list
mac : [22:22:22:22:22:22]
output
linux : 12 + 2 = 14
ubuntu : 2 + 5 = 7
parameterlist 2
venueid : [VID001, VID002] // this is a compulsory field in the parameter list
mac : [22:22:22:22:22:22] // this is an optional field in the parameter list
output
linux : 12 + 4 + 2 = 18
ubuntu : 2 + 2 + 5 = 9
Here is the sample dataset:
{
    "_id" : ObjectId("57f940c4932a00aba387b0b0"),
    "tenantID" : 1,
    "date" : "2016-10-09 00:23:56",
    "venueList" : [
        {
            "id" : "VID001",
            "sum" : [
                { "name" : "linux", "value" : 16 },
                { "name" : "ubuntu", "value" : 7 }
            ],
            "ssidList" : [ // this is the list of SSIDs in the venue
                {
                    "id" : "SSID001",
                    "sum" : [
                        { "name" : "linux", "value" : 12 },
                        { "name" : "ubuntu", "value" : 2 }
                    ],
                    "macList" : [ // this is the MAC list inside a particular SSID, e.g. the MAC list inside SSID1212
                        {
                            "id" : "22:22:22:22:22:22",
                            "sum" : [
                                { "name" : "linux", "value" : 12 },
                                { "name" : "ubuntu", "value" : 2 }
                            ]
                        }
                    ]
                },
                {
                    "id" : "SSID002",
                    "sum" : [
                        { "name" : "linux", "value" : 4 },
                        { "name" : "ubuntu", "value" : 5 }
                    ],
                    "macList" : [ // this is the MAC list inside a particular SSID, e.g. the MAC list inside SSID1212
                        {
                            "id" : "22:22:22:22:22:22", // this should be selected for parameterlist 2, because that parameter list has no ssid selection
                            "sum" : [
                                { "name" : "linux", "value" : 4 },
                                { "name" : "ubuntu", "value" : 2 }
                            ]
                        },
                        {
                            "id" : "44:44:44:44:44:44",
                            "sum" : [
                                { "name" : "linux", "value" : 12 },
                                { "name" : "ubuntu", "value" : 3 }
                            ]
                        }
                    ]
                }
            ]
        },
        {
            "id" : "VID002",
            "sum" : [
                { "name" : "linux", "value" : 2 },
                { "name" : "ubuntu", "value" : 5 }
            ],
            "ssidList" : [
                {
                    "id" : "SSID001",
                    "sum" : [
                        { "name" : "linux", "value" : 2 },
                        { "name" : "ubuntu", "value" : 5 }
                    ],
                    "macList" : [
                        {
                            "id" : "22:22:22:22:22:22",
                            "sum" : [
                                { "name" : "linux", "value" : 2 },
                                { "name" : "ubuntu", "value" : 5 }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
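Please help me sort out this issue; I would be very grateful. If there is also any problem with my dataset, please mention it. Your comments are especially helpful to me, as I am new to MongoDB.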
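To make the selection rule behind the two parameter lists concrete, here is a minimal plain JavaScript sketch that computes the totals in memory, without MongoDB. It is only one reading of the question: an empty ssids or macs list is treated as "no filter at that level", and the values are taken from the deepest level that has a filter. The names `doc`, `s`, `keep` and this `getTotal` body are made up for illustration, and the second sum entry under VID002 is assumed to be ubuntu, as the expected outputs imply.

```javascript
// Trimmed copy of the sample document (only the fields the totals need).
// Assumption: the second entry of every "sum" under VID002 is "ubuntu".
const s = (linux, ubuntu) => [
  { name: "linux", value: linux },
  { name: "ubuntu", value: ubuntu },
];
const doc = {
  venueList: [
    {
      id: "VID001", sum: s(16, 7),
      ssidList: [
        { id: "SSID001", sum: s(12, 2),
          macList: [{ id: "22:22:22:22:22:22", sum: s(12, 2) }] },
        { id: "SSID002", sum: s(4, 5),
          macList: [
            { id: "22:22:22:22:22:22", sum: s(4, 2) },
            { id: "44:44:44:44:44:44", sum: s(12, 3) },
          ] },
      ],
    },
    {
      id: "VID002", sum: s(2, 5),
      ssidList: [
        { id: "SSID001", sum: s(2, 5),
          macList: [{ id: "22:22:22:22:22:22", sum: s(2, 5) }] },
      ],
    },
  ],
};

// An empty or missing id list means "no filter at this level".
const keep = (items, ids) =>
  ids && ids.length > 0 ? items.filter((item) => ids.includes(item.id)) : items;

function getTotal(document, venueIds, ssids = [], macs = []) {
  // Start at venue level, then descend only as deep as the filters require.
  let nodes = keep(document.venueList, venueIds);
  if (ssids.length > 0 || macs.length > 0) {
    nodes = keep(nodes.flatMap((venue) => venue.ssidList), ssids);
  }
  if (macs.length > 0) {
    nodes = keep(nodes.flatMap((ssid) => ssid.macList), macs);
  }
  // Add up the values per user agent name.
  const totals = {};
  for (const node of nodes) {
    for (const { name, value } of node.sum) {
      totals[name] = (totals[name] || 0) + value;
    }
  }
  return totals;
}

const venues = ["VID001", "VID002"];
const mac = ["22:22:22:22:22:22"];
console.log(getTotal(doc, venues, ["SSID001"], mac)); // { linux: 14, ubuntu: 7 }
console.log(getTotal(doc, venues, [], mac)); // { linux: 18, ubuntu: 9 }
```

For parameterlist 2 the listed terms 12 + 4 + 2 and 2 + 2 + 5 add up to 18 and 9, which is what the second call returns.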
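A sample query is as follows.
If the method structure looks like this: getTotal(list venueIds, list ssids, list macs)
if macs != empty && ssids != empty && venueIds != empty
db.getCollection('ua').aggregate(
    [
        // Keep only documents that contain at least one requested venue, SSID and MAC
        { $match : {
            "venueList" : { $elemMatch : { id : { $in : venueIds } } },
            "venueList.ssidList" : { $elemMatch : { id : { $in : ssids } } },
            "venueList.ssidList.macList" : { $elemMatch : { id : { $in : macs } } }
        } },
        { $unwind : "$venueList" },
        // The first $match selects whole documents, so filter the unwound elements again
        { $match : { "venueList.id" : { $in : venueIds } } },
        { $project : { "ssidList" : "$venueList.ssidList" } },
        { $unwind : "$ssidList" },
        { $match : { "ssidList.id" : { $in : ssids } } },
        { $project : { "macList" : "$ssidList.macList" } },
        { $unwind : "$macList" },
        { $match : { "macList.id" : { $in : macs } } },
        { $project : { "sum" : "$macList.sum" } },
        { $unwind : "$sum" },
        // Add up the values per user agent name
        { $group : { _id : "$sum.name", total : { $sum : "$sum.value" } } }
    ]
)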
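if macs == empty && ssids != empty && venueIds != empty
db.getCollection('ua').aggregate(
    [
        // Keep only documents that contain at least one requested venue and SSID
        { $match : {
            "venueList" : { $elemMatch : { id : { $in : venueIds } } },
            "venueList.ssidList" : { $elemMatch : { id : { $in : ssids } } }
        } },
        { $unwind : "$venueList" },
        // The first $match selects whole documents, so filter the unwound elements again
        { $match : { "venueList.id" : { $in : venueIds } } },
        { $project : { "ssidList" : "$venueList.ssidList" } },
        { $unwind : "$ssidList" },
        { $match : { "ssidList.id" : { $in : ssids } } },
        { $project : { "sum" : "$ssidList.sum" } },
        { $unwind : "$sum" },
        // Add up the values per user agent name
        { $group : { _id : "$sum.name", total : { $sum : "$sum.value" } } }
    ]
)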
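if macs == empty && ssids == empty && venueIds != empty
db.getCollection('ua').aggregate(
    [
        // Keep only documents that contain at least one requested venue
        { $match : { "venueList" : { $elemMatch : { id : { $in : venueIds } } } } },
        { $unwind : "$venueList" },
        // The first $match selects whole documents, so filter the unwound venues again
        { $match : { "venueList.id" : { $in : venueIds } } },
        { $project : { "sum" : "$venueList.sum" } },
        { $unwind : "$sum" },
        // Add up the values per user agent name
        { $group : { _id : "$sum.name", total : { $sum : "$sum.value" } } }
    ]
)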
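The separate cases can also be folded into one function that assembles the pipeline from whichever lists are non-empty. The following is a rough sketch rather than a tested solution: `buildPipeline` is a hypothetical helper name, and it assumes that a MAC filter without an SSID filter should look inside every SSID of the selected venues, which is how parameterlist 2 in the question reads.

```javascript
// Hypothetical helper: builds one pipeline for any combination of filters.
// venueIds is required; ssids and macs may be empty arrays.
function buildPipeline(venueIds, ssids = [], macs = []) {
  const pipeline = [
    // Cheap pre-filter so documents without a requested venue are skipped.
    { $match: { "venueList.id": { $in: venueIds } } },
    { $unwind: "$venueList" },
    // Filter again after $unwind: the first $match keeps whole documents,
    // including venues that were not requested.
    { $match: { "venueList.id": { $in: venueIds } } },
  ];
  let sumPath = "$venueList.sum";

  // Descend to SSID level when either an SSID or a MAC filter is given.
  if (ssids.length > 0 || macs.length > 0) {
    pipeline.push({ $unwind: "$venueList.ssidList" });
    if (ssids.length > 0) {
      pipeline.push({ $match: { "venueList.ssidList.id": { $in: ssids } } });
    }
    sumPath = "$venueList.ssidList.sum";
  }

  // Descend to MAC level only when a MAC filter is given.
  if (macs.length > 0) {
    pipeline.push(
      { $unwind: "$venueList.ssidList.macList" },
      { $match: { "venueList.ssidList.macList.id": { $in: macs } } }
    );
    sumPath = "$venueList.ssidList.macList.sum";
  }

  // Add up the values per user agent name at the deepest selected level.
  pipeline.push(
    { $project: { sum: sumPath } },
    { $unwind: "$sum" },
    { $group: { _id: "$sum.name", total: { $sum: "$sum.value" } } }
  );
  return pipeline;
}
```

In the mongo shell the result would be passed straight to the collection, for example db.getCollection('ua').aggregate(buildPipeline(venueIds, ssids, macs)).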
If there are too many documents to process, you can use allowDiskUse.
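As a small illustration, allowDiskUse is passed in the options argument of aggregate(), next to the pipeline. The wrapper name `aggregateWithDisk` below is made up for this sketch, and `collection` stands for anything that exposes aggregate(pipeline, options), such as db.getCollection('ua') in the mongo shell.

```javascript
// Hypothetical wrapper around aggregate(); not part of any MongoDB API.
function aggregateWithDisk(collection, pipeline) {
  // allowDiskUse lets pipeline stages write temporary files to disk
  // when they would otherwise exceed the in-memory limit.
  return collection.aggregate(pipeline, { allowDiskUse: true });
}
```

This trades speed for capacity, so it is probably only worth enabling when a query actually fails on the memory limit.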