Filtering out array items in a CosmosDB query with best performance
In CosmosDB, I can use ARRAY_CONTAINS to select documents in which an item of an array has a given value. For example:
SELECT * FROM d WHERE ARRAY_CONTAINS(d.Assignments, {'Owner':'Jason'}, true)
The query above returns the following:
[
  {
    "id": "0",
    "Assignments": [
      {
        "Fruit": "Apple",
        "Owner": "Jason"
      },
      {
        "Fruit": "Orange",
        "Owner": "Jason"
      },
      {
        "Fruit": "Pear",
        "Owner": "Amy"
      }
    ]
  },
  {
    "id": "1",
    "Assignments": [
      {
        "Fruit": "Pear",
        "Owner": "Liz"
      },
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "2",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Liz"
      },
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  }
]
However, I would also like the returned JSON to leave out every array item that does not match my query. For example:
[
  {
    "id": "0",
    "Assignments": [
      {
        "Fruit": "Apple",
        "Owner": "Jason"
      },
      {
        "Fruit": "Orange",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "1",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "2",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  }
]
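I would prefer to do this within the query itself, provided it can be done with good performance and a relatively low request unit (RU) cost.
Or would it be wiser to filter the results in code after the JSON is returned?
In some cases I may have a few hundred array items, of which roughly 60-80% need to be filtered out.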
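I added 3 more similar documents to these records. You can meet this requirement with the following query, which filters the array items in a subquery: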
SELECT f.id, ARRAY(SELECT * FROM c in f.Assignments WHERE c.Owner = 'Jason') AS Assignments FROM f WHERE ARRAY_CONTAINS(f.Assignments, {'Owner':'Jason'}, true)
Result:
[
  {
    "id": "0",
    "Assignments": [
      {
        "Fruit": "Apple",
        "Owner": "Jason"
      },
      {
        "Fruit": "Orange",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "1",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "2",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "3",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "4",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  },
  {
    "id": "5",
    "Assignments": [
      {
        "Fruit": "Grape",
        "Owner": "Jason"
      }
    ]
  }
]
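For comparison, the client-side alternative raised in the question (fetch with the plain ARRAY_CONTAINS query, then filter in code) could look roughly like the sketch below. It is only an illustration: the function name filter_assignments is hypothetical, and docs stands in for the unfiltered documents that the first query in the question returns. No Cosmos DB SDK call is made here.

```python
from typing import Any


def filter_assignments(docs: list[dict[str, Any]], owner: str) -> list[dict[str, Any]]:
    """Keep only the assignments owned by `owner`; drop documents left with none.

    Hypothetical helper that mirrors what the ARRAY(...) subquery does server-side.
    """
    result = []
    for doc in docs:
        kept = [a for a in doc.get("Assignments", []) if a.get("Owner") == owner]
        if kept:
            result.append({"id": doc["id"], "Assignments": kept})
    return result


# Sample input: the unfiltered documents shown in the question.
docs = [
    {"id": "0", "Assignments": [
        {"Fruit": "Apple", "Owner": "Jason"},
        {"Fruit": "Orange", "Owner": "Jason"},
        {"Fruit": "Pear", "Owner": "Amy"},
    ]},
    {"id": "1", "Assignments": [
        {"Fruit": "Pear", "Owner": "Liz"},
        {"Fruit": "Grape", "Owner": "Jason"},
    ]},
    {"id": "2", "Assignments": [
        {"Fruit": "Grape", "Owner": "Liz"},
        {"Fruit": "Grape", "Owner": "Jason"},
    ]},
]

filtered = filter_assignments(docs, "Jason")
```

With this approach every array item still travels over the wire before being discarded, so with a few hundred items per document and 60-80% of them unwanted, the in-query filter returns a noticeably smaller payload. Which option costs fewer RUs is best checked by comparing the request charge of both queries on your own data.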
Query statistics for the ARRAY subquery above: