为什么 Mongo 在执行 IXSCAN 后在 FETCH 中查询空过滤器

Why Mongo query for null filters in FETCH after performing IXSCAN

根据Mongo Documentation,

The { item : null } query matches documents that either contain the item field whose value is null or that do not contain the item field.

我找不到这方面的文档,但据我所知,这两种情况(值为 null 或字段缺失)都以 null 的形式存储在索引中。

因此,如果我执行 db.orders.createIndex({item: 1}) 然后 db.orders.find({item: null}),我希望 IXSCAN 找到所有包含 item 字段且值为 [=] 的文档15=] 或不包含 item 字段的那些文件。

那为什么db.orders.find({item: null}).explain()在执行完IXSCAN之后,在FETCH阶段执行filter: {item: {$eq: null}}呢?需要过滤掉哪些可能的文档?

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "temp.orders",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "item" : {
                "$eq" : null
            }
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "item" : {
                    "$eq" : null
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "item" : 1
                },
                "indexName" : "item_1",
                "isMultiKey" : false,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "forward",
                "indexBounds" : {
                    "item" : [
                        "[null, null]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "Andys-MacBook-Pro-2.local",
        "port" : 27017,
        "version" : "3.2.8",
        "gitVersion" : "ed70e33130c977bda0024c125b56d159573dbaf0"
    },
    "ok" : 1
}

我认为 undefined 值可能会被索引为 null,但简单的实验排除了这一点:

> db.orders.createIndex({item: 1})
{
    "createdCollectionAutomatically" : true,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
> db.orders.insert({item: undefined})
WriteResult({ "nInserted" : 1 })
> db.orders.find({item: {$type: 6}}).explain()
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "temp.orders",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "item" : {
                "$type" : 6
            }
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "item" : {
                    "$type" : 6
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "item" : 1
                },
                "indexName" : "item_1",
                "isMultiKey" : false,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "forward",
                "indexBounds" : {
                    "item" : [
                        "[undefined, undefined]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "Andys-MacBook-Pro-2.local",
        "port" : 27017,
        "version" : "3.2.8",
        "gitVersion" : "ed70e33130c977bda0024c125b56d159573dbaf0"
    },
    "ok" : 1
}

空相等匹配谓词(例如 {"a.b": null})的语义已经足够复杂,因为字段可能包含子文档,单独的索引扫描不足以提供正确的结果。

根据https://jira.mongodb.org/browse/SERVER-18653?focusedCommentId=931817&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-931817

Version 2.6.0 of the server changed the semantics of a null equality match predicate, such that the document {a: []} was no longer considered a match for the query predicate {"a.b": null} (in prior versions of the server, this document was considered a match for this predicate). This is documented in the 2.6 compatibility notes, under the "null comparison" section.

For an index with key pattern {"a.b": 1}, this document {a: []} generates the index key {"": null}. Other documents like {a: null} and the empty document {} also generate the index key {"": null}. As a result, if a query with predicate {"a.b": null} uses this index, the query system cannot tell just from the index key {"": null} whether or not the associated document matches the predicate. As a result, INEXACT_FETCH bounds are assigned instead of EXACT bounds, and hence a FETCH stage is added to the query execution tree.

补充说明:

  1. The document {} generates the index key {"": null} for the index with key pattern {"a.b": 1}.
  2. The document {a: []} also generates the index key {"": null} for the index with key pattern {"a.b": 1}.
  3. The document {} matches the query {"a.b": null}.
  4. The document {a: []} does not match the query {"a.b": null}.

Therefore, a query {"a.b": null} that is answered by an index with key pattern {"a.b": 1} must fetch the document and re-check the predicate, in order to ensure that the document {} is included in the result set and that the document {a: []} is not included in the result set.