MongoDB Projection of Nested Arrays
I have a collection "accounts" which contains documents similar to this structure:
{
"email" : "john.doe@acme.com",
"groups" : [
{
"name" : "group1",
"contacts" : [
{ "localId" : "c1", "address" : "some address 1" },
{ "localId" : "c2", "address" : "some address 2" },
{ "localId" : "c3", "address" : "some address 3" }
]
},
{
"name" : "group2",
"contacts" : [
{ "localId" : "c1", "address" : "some address 1" },
{ "localId" : "c3", "address" : "some address 3" }
]
}
]
}
With
q = { "email" : "john.doe@acme.com", "groups" : { $elemMatch: { "name" : "group1" } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1" } } }
db.accounts.find( q, p ).pretty()
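I successfully get the groups of the specified account that I am interested in.
Question: How can I get a limited list of "contacts" within a certain "group" of a specified "account"? Suppose I have the following arguments:
- account: email - "john.doe@acme.com"
- group: name - "group1"
- contacts: array of local IDs - ["c1", "c3", "Not existing id"]
Given these arguments, I would like to get the following result: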
{
"groups" : [
{
"name" : "group1", (might be omitted)
"contacts" : [
{ "localId" : "c1", "address" : "some address 1" },
{ "localId" : "c3", "address" : "some address 3" }
]
}
]
}
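I do not need anything apart from the resulting contacts.
Approaches
For simplicity, all queries try to fetch just one matching contact rather than a list of matching contacts.
I have tried the following queries without success: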
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts" : { $elemMatch: { "localId" : "c1" } } } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts.localId" : "c1" } } }
not working: returns whole array or nothing depending on localId
p = { "groups.$" : { $elemMatch: { "localId" : "c1" } } }
error: {
"$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
"code" : 17287
}
p = { "groups.contacts" : { $elemMatch: { "localId" : "c1" } } }
error: {
"$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
"code" : 17287
}
Any help is appreciated!
You can use the $unwind operator of the aggregation framework.
For example:
db.accounts.aggregate([{$unwind:'$groups'}, {$unwind:'$groups.contacts'}, {$match:{email:'john.doe@acme.com', 'groups.name':'group1', 'groups.contacts.localId':{$in:['c1', 'c3', 'whatever']}}}]);
This should give the following result:
{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "john.doe@acme.com", "groups" : { "name" : "group1", "contacts" : { "localId" : "c1", "address" : "some address 1" } } }
{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "john.doe@acme.com", "groups" : { "name" : "group1", "contacts" : { "localId" : "c3", "address" : "some address 3" } } }
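If you need only one object, you can use the $group operator.
2017 Update
Such a well-posed question deserves a modern response. The kind of array filtering requested can actually be done in modern MongoDB releases after 3.2 with simple $match and $project pipeline stages, much as the original plain query operation intends. (The $addFields stage used below requires MongoDB 3.4 or later; on 3.2, $project can be used in its place.)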
db.accounts.aggregate([
{ "$match": {
"email" : "john.doe@acme.com",
"groups": {
"$elemMatch": {
"name": "group1",
"contacts.localId": { "$in": [ "c1","c3", null ] }
}
}
}},
{ "$addFields": {
"groups": {
"$filter": {
"input": {
"$map": {
"input": "$groups",
"as": "g",
"in": {
"name": "$$g.name",
"contacts": {
"$filter": {
"input": "$$g.contacts",
"as": "c",
"cond": {
"$or": [
{ "$eq": [ "$$c.localId", "c1" ] },
{ "$eq": [ "$$c.localId", "c3" ] }
]
}
}
}
}
}
},
"as": "g",
"cond": {
"$and": [
{ "$eq": [ "$$g.name", "group1" ] },
{ "$gt": [ { "$size": "$$g.contacts" }, 0 ] }
]
}
}
}
}}
])
This makes use of the $filter and $map operators to return only the array elements that meet the conditions, and it performs far better than using $unwind. Since the pipeline stages effectively mirror the "query" and "project" structure of a .find() operation, the performance here is basically on par with such an operation.
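The pipeline above hard-codes the two wanted ids. As a minimal sketch, assuming you want to pass the email, group name and id list as parameters, the same two stages could be generated by a small helper. The function name buildContactsPipeline is hypothetical and not part of the original answer; the stages and operators are the ones already shown.

```javascript
// Hypothetical helper (an assumption for illustration, not from the answer):
// builds the same $match + $addFields pipeline for arbitrary arguments.
function buildContactsPipeline(email, groupName, localIds) {
  // One { $eq: [...] } clause per requested id, combined with $or,
  // mirroring the hand-written condition in the pipeline above.
  var idClauses = localIds.map(function (id) {
    return { $eq: ["$$c.localId", id] };
  });

  return [
    // Select only documents holding the group with at least one wanted contact.
    { $match: {
        email: email,
        groups: {
          $elemMatch: { name: groupName, "contacts.localId": { $in: localIds } }
        }
    } },
    // Rewrite "groups": trim each group's contacts, then drop unwanted groups.
    { $addFields: {
        groups: {
          $filter: {
            input: {
              $map: {
                input: "$groups",
                as: "g",
                in: {
                  name: "$$g.name",
                  contacts: {
                    $filter: {
                      input: "$$g.contacts",
                      as: "c",
                      cond: { $or: idClauses }
                    }
                  }
                }
              }
            },
            as: "g",
            cond: {
              $and: [
                { $eq: ["$$g.name", groupName] },
                { $gt: [{ $size: "$$g.contacts" }, 0] }
              ]
            }
          }
        }
    } }
  ];
}

var pipeline = buildContactsPipeline(
  "john.doe@acme.com", "group1", ["c1", "c3", "Not existing id"]
);
// In the shell this would then be run as: db.accounts.aggregate(pipeline)
```

The helper only constructs the pipeline document, so it can be inspected or printed before it is sent to the server.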
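Note that where the intention is actually to work "across documents", bringing together details from "multiple" documents rather than "one", this would usually require some type of $unwind operation so that the array items become accessible for "grouping".
Basically, it is this approach: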
db.accounts.aggregate([
// Match the documents by query
{ "$match": {
"email" : "john.doe@acme.com",
"groups.name": "group1",
"groups.contacts.localId": { "$in": [ "c1","c3", null ] },
}},
// De-normalize nested array
{ "$unwind": "$groups" },
{ "$unwind": "$groups.contacts" },
// Filter the actual array elements as desired
{ "$match": {
"groups.name": "group1",
"groups.contacts.localId": { "$in": [ "c1","c3", null ] },
}},
// Group the intermediate result.
{ "$group": {
"_id": { "email": "$email", "name": "$groups.name" },
"contacts": { "$push": "$groups.contacts" }
}},
// Group the final result
{ "$group": {
"_id": "$_id.email",
"groups": { "$push": {
"name": "$_id.name",
"contacts": "$contacts"
}}
}}
])
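This is "array filtering" on more than a single match, which the basic projection capabilities of .find() cannot do.
You have "nested" arrays, so you need to process $unwind twice, along with the other operations.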
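To make the effect of the double $unwind easier to follow, here is a rough emulation of the unwind, match and group steps in plain JavaScript, run against the sample document from the question. It is only an illustration of what the stages do to the data, not how MongoDB implements them; the function names unwindTwice, matchRows and regroup are made up for this sketch, and it is meant for Node.js rather than the mongo shell.

```javascript
// Sample document from the question.
var account = {
  email: "john.doe@acme.com",
  groups: [
    { name: "group1", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c2", address: "some address 2" },
      { localId: "c3", address: "some address 3" }
    ] },
    { name: "group2", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c3", address: "some address 3" }
    ] }
  ]
};

// Emulates { $unwind: "$groups" } followed by { $unwind: "$groups.contacts" }:
// one output row per (group, contact) pair.
function unwindTwice(doc) {
  var rows = [];
  doc.groups.forEach(function (g) {
    g.contacts.forEach(function (c) {
      rows.push({ email: doc.email, groups: { name: g.name, contacts: c } });
    });
  });
  return rows;
}

// Emulates the second $match: keep rows for the wanted group and ids.
function matchRows(rows, groupName, localIds) {
  return rows.filter(function (r) {
    return r.groups.name === groupName &&
           localIds.indexOf(r.groups.contacts.localId) !== -1;
  });
}

// Emulates the two $group stages: collect contacts per group,
// then collect groups per email.
function regroup(rows) {
  var byEmail = {};
  rows.forEach(function (r) {
    var groups = byEmail[r.email] || (byEmail[r.email] = {});
    var contacts = groups[r.groups.name] || (groups[r.groups.name] = []);
    contacts.push(r.groups.contacts);
  });
  return Object.keys(byEmail).map(function (email) {
    return {
      _id: email,
      groups: Object.keys(byEmail[email]).map(function (name) {
        return { name: name, contacts: byEmail[email][name] };
      })
    };
  });
}

var rows = unwindTwice(account); // 5 rows: 3 from group1, 2 from group2
var result = regroup(matchRows(rows, "group1", ["c1", "c3", "Not existing id"]));
console.log(JSON.stringify(result, null, 2));
```

After the two unwinds every contact sits in its own row, which is why an ordinary match can then pick out individual contacts before they are pushed back into arrays.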