MongoDB multikey index does not seem to be used when sorting
Suppose I have tx_collection, which contains 3 documents like the ones below:
{
"block_number": 1,
"value": 122,
"transfers": [
{
"from": "foo1",
"to": "bar1",
"amount": 111
},
{
"from": "foo3",
"to": "bar3",
"amount": 11
},
]
},
{
"block_number": 2,
"value": 88,
"transfers": [
{
"from": "foo11",
"to": "bar11",
"amount": 33
},
{
"from": "foo22",
"to": "bar22",
"amount": 55
},
]
},
{
"block_number": 3,
"value": 233,
"transfers": [
{
"from": "foo1",
"to": "bar1",
"amount": 33
},
{
"from": "foo3",
"to": "bar3",
"amount": 200
},
]
}
For performance reasons, I created a multikey index on transfers.amount.
When I sort by transfers.amount:
db.getCollection('tx_transaction').find({}).sort({"transfers.amount":-1})
I expect the documents to be ordered by the maximum value of the subfield transfers.amount, like this:
{
"block_number": 3,
"value": 233,
"transfers": [
{
"from": "foo1",
"to": "bar1",
"amount": 33
},
{
"from": "foo3",
"to": "bar3",
"amount": 200
},
]
},
{
"block_number": 1,
"value": 122,
"transfers": [
{
"from": "foo1",
"to": "bar1",
"amount": 111
},
{
"from": "foo3",
"to": "bar3",
"amount": 11
},
]
},
{
"block_number": 2,
"value": 88,
"transfers": [
{
"from": "foo11",
"to": "bar11",
"amount": 33
},
{
"from": "foo22",
"to": "bar22",
"amount": 55
},
]
}
The sort works fine because there are only 3 documents. The resulting order is block_number 3 -> block_number 1 -> block_number 2, which is what I expect.
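To make the expected ordering concrete, here is a minimal sketch in plain JavaScript. It only mimics the outcome of a descending sort on an array field (the largest transfers.amount in each document acts as the sort key); it is not how MongoDB executes the query. The helper names maxAmount and sortByMaxAmountDesc are made up for this illustration, and the documents are trimmed to the fields that matter.

```javascript
// Plain-JavaScript illustration only; this is not MongoDB's implementation.
const docs = [
  { block_number: 1, transfers: [{ amount: 111 }, { amount: 11 }] },
  { block_number: 2, transfers: [{ amount: 33 }, { amount: 55 }] },
  { block_number: 3, transfers: [{ amount: 33 }, { amount: 200 }] },
];

// For a descending sort on an array field, the largest element acts as the sort key.
function maxAmount(doc) {
  return Math.max(...doc.transfers.map((t) => t.amount));
}

// Copy before sorting so the input array is left untouched.
function sortByMaxAmountDesc(input) {
  return [...input].sort((a, b) => maxAmount(b) - maxAmount(a));
}

const order = sortByMaxAmountDesc(docs).map((d) => d.block_number);
console.log(order); // [ 3, 1, 2 ]
```

The sort keys are 111, 55 and 200 for block_number 1, 2 and 3, which gives the order 3 -> 1 -> 2 described above.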
My problem is that when there are 19 million documents, the query throws an error.
The message looks like this:
"errmsg" : "Executor error during find command: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.",
It seems the multikey index is not being used for the sort.
Do you have any idea why this error message is thrown?
FYI:
- My MongoDB version is 3.6.3
- tx_collection is sharded
As of MongoDB 3.6 and newer, I believe this is expected, as mentioned in Use Indexes to Sort Query Results:
As a result of changes to sorting behavior on array fields in MongoDB 3.6, when sorting on an array indexed with a multikey index the query plan includes a blocking SORT stage. The new sorting behavior may negatively impact performance.
In a blocking SORT, all input must be consumed by the sort step before it can produce output. In a non-blocking, or indexed sort, the sort step scans the index to produce results in the requested order.
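The difference between the two strategies in the quote can be sketched with a toy model in plain JavaScript. This is only an illustration of the idea, not MongoDB internals; blockingSort, indexedSort and counted are hypothetical names invented for this example.

```javascript
// Illustration only: a toy model of the two strategies, not MongoDB internals.

// Blocking sort: every input item is buffered before the first result is produced,
// so memory use grows with the size of the input.
function* blockingSort(source, compare) {
  const buffer = [];
  for (const item of source) buffer.push(item); // consume all input first
  buffer.sort(compare);
  yield* buffer;
}

// Indexed (non-blocking) sort: the input already arrives in index order,
// so each result can be passed through as soon as it is read.
function* indexedSort(orderedSource) {
  for (const item of orderedSource) yield item;
}

// Hypothetical source that records how many items have been read from it.
function counted(items, counter) {
  return (function* () {
    for (const item of items) {
      counter.read += 1;
      yield item;
    }
  })();
}

const desc = (a, b) => b - a;

const blockingCounter = { read: 0 };
const firstBlocking = blockingSort(counted([55, 200, 111], blockingCounter), desc).next().value;

const indexedCounter = { read: 0 };
const firstIndexed = indexedSort(counted([200, 111, 55], indexedCounter)).next().value;

console.log(firstBlocking, blockingCounter.read); // 200 3
console.log(firstIndexed, indexedCounter.read); // 200 1
```

Both variants return 200 first, but the blocking version has to read all three inputs before it can do so. With 19 million documents, that buffering is what runs into the 33554432-byte limit from the error message.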
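In other words, a "blocking sort" implies the presence of a SORT_KEY_GENERATOR stage in the query plan, which is the stage that signifies an in-memory sort. This behaviour differs from pre-3.6 MongoDB because of SERVER-19402, which addressed inconsistencies in what it means to sort on an array field.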
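One way to check a plan for these stages is to walk the winningPlan object from the explain() output. The helper below is a rough sketch in plain JavaScript: collectStages and hasBlockingSort are hypothetical names, the sample plan is hand-written rather than captured from a real server, and the handling of the shards array is my assumption about how explain output is nested on a sharded cluster.

```javascript
// Collect the stage names of a plan tree, following inputStage / inputStages.
function collectStages(plan, out = []) {
  if (!plan || typeof plan !== "object") return out;
  if (typeof plan.stage === "string") out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  // Assumption: on a sharded cluster the per-shard plans are nested under `shards`.
  for (const shard of plan.shards || []) collectStages(shard.winningPlan, out);
  return out;
}

// A plan contains a blocking sort if any stage is named exactly "SORT".
function hasBlockingSort(plan) {
  return collectStages(plan).includes("SORT");
}

// Hand-written sample plan (an assumption, not captured from a real server).
const samplePlan = {
  stage: "SORT",
  inputStage: {
    stage: "SORT_KEY_GENERATOR",
    inputStage: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
  },
};

const stages = collectStages(samplePlan);
console.log(stages, hasBlockingSort(samplePlan));
```

In the mongo shell, the object to pass in would be the queryPlanner.winningPlan field of the document returned by calling explain() on the sorted query.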
There is a ticket to improve this situation: SERVER-31898. Unfortunately, there is no workaround for this behaviour yet.