AWS DocumentDB does not use indexes when $sort and $match are used at the same time

Question

DocumentDB ignores the indexes on every field except the one used for sorting:

db.requests.aggregate([
    { $match: {'DeviceId': '5f68c9c1-73c1-e5cb-7a0b-90be2f80a332'}},
    { $sort: { 'Timestamp': 1 } }
])

Useful information:

> explain('executionStats')
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "admin_portal.requests",
        "winningPlan" : {
            "stage" : "IXSCAN",
            "indexName" : "Timestamp_1",
            "direction" : "forward"
        }
    },
    "executionStats" : {
        "executionSuccess" : true,
        "executionTimeMillis" : "398883.755",
        "planningTimeMillis" : "0.274",
        "executionStages" : {
            "stage" : "IXSCAN",
            "nReturned" : "20438",
            "executionTimeMillisEstimate" : "398879.028",
            "indexName" : "Timestamp_1",
            "direction" : "forward"
        }
    },
    "serverInfo" : {
       ...
    },
    "ok" : 1.0,
    "operationTime" : Timestamp(1622585939, 1)
}

> db.requests.getIndexKeys()
[
    {
        "_id" : 1
    },
    {
        "Timestamp" : 1
    },
    {
        "DeviceId" : 1
    }
]

It works fine when I query the documents without the sort, or when I use the find and sort functions instead of aggregate.
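For comparison, the find-and-sort form mentioned above would presumably look like the sketch below. The helper name `findSortedByTimestamp` is hypothetical and only wraps the call so that the filter and sort keys are visible in one place; the collection and field names come from the aggregate query.

```javascript
// Hypothetical helper: the find()/sort() equivalent of the $match + $sort pipeline.
// `collection` is expected to be a collection handle such as db.requests.
function findSortedByTimestamp(collection, deviceId) {
    return collection
        .find({ DeviceId: deviceId })   // same filter as the $match stage
        .sort({ Timestamp: 1 });        // same ordering as the $sort stage
}

// Shell usage (assumes `db` points at the admin_portal database):
// findSortedByTimestamp(db.requests, '5f68c9c1-73c1-e5cb-7a0b-90be2f80a332')
```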

Important note: it also works perfectly on an original MongoDB instance, but not on DocumentDB.

Answer 1

This is more of a "how does DocumentDB choose a query plan" kind of question. There are many answers on Stack Overflow about how MongoDB does it.

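Clearly, the "wrong" index can end up being chosen when the planner's trial runs fail because of the data distribution. The issue here is that DocumentDB adds an unknown layer: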

Amazon DocumentDB emulates the MongoDB 4.0 API on a purpose-built database engine that utilizes a distributed, fault-tolerant, self-healing storage system. As a result, query plans and the output of explain() may differ between Amazon DocumentDB and MongoDB. Customers who want control over their query plan can use the $hint operator to enforce selection of a preferred index.

They state that differences can occur because of this layer.

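So now we understand (sort of) why the wrong index is selected. What can we do? Well, unless you want to drop or rebuild your indexes somehow, you need to use the hint option for your pipeline: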

db.collection.aggregate(pipeline, {hint: "index_name"})
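Applied to the pipeline from the question, this could look like the sketch below. The index name `DeviceId_1` is an assumption: it is the default name for an index on `{ DeviceId: 1 }`, matching the `Timestamp_1` naming seen in the explain output, so check the actual name with `db.requests.getIndexes()` first. The wrapper `aggregateWithHint` is a hypothetical helper, not part of any API.

```javascript
// Pipeline taken from the question.
const pipeline = [
    { $match: { DeviceId: '5f68c9c1-73c1-e5cb-7a0b-90be2f80a332' } },
    { $sort: { Timestamp: 1 } }
];

// Assumption: the DeviceId index carries the default name "DeviceId_1".
const options = { hint: 'DeviceId_1' };

// Hypothetical helper: run the pipeline while forcing the DeviceId index.
function aggregateWithHint(collection) {
    return collection.aggregate(pipeline, options);
}

// Shell usage (assumes `db` points at the admin_portal database):
// aggregateWithHint(db.requests)
// To check the chosen plan:
// db.requests.explain('executionStats').aggregate(pipeline, options)
```

If the hint takes effect, the explain output should report the DeviceId index in the winning plan instead of `Timestamp_1`.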

mongodb

aws-documentdb