MongoDB Compass shows the wrong minimum value in the data distribution of a key
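I am using MongoDB Compass version 1.5.1 for Mac.
When I look at the distribution of values, Compass draws the chart shown below:
As you can see, min and max values are available. But the minimum values are wrong. I know the minimum values of these two keys are 1 and 1, not 9 and 13.
Does anyone know how to fix this?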
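Got it. The standard report is based on a sample of at most 1000 documents, so the reported minimum is the minimum of the sample, not of the whole collection.
From the documentation: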
Sampling in MongoDB Compass is the practice of selecting a subset of
data from the desired collection and analyzing the documents within
the sample set.
Sampling is commonly used in statistical analysis because analyzing a
subset of data gives similar results to analyzing all of the data. In
addition, sampling allows results to be generated quickly rather than
performing a potentially long and computationally expensive collection
scan.
MongoDB Compass employs two distinct sampling mechanisms.
Collections in MongoDB 3.2 are sampled via the $sample operator in the
aggregation framework of the core server. This provides efficient
random sampling without replacement over the entire collection, or
over the subset of documents specified by a query.
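As a rough illustration of the server-side path described above, the sampling can be expressed as an aggregation pipeline. This is a minimal sketch, not Compass's actual code: the helper names and the field name `score` are hypothetical, while `$match`, `$sample`, `$group`, `$min` and `$max` are standard aggregation operators. The second helper builds a pipeline that scans every document, which is one way to check a sampled minimum against the exact one.

```python
from typing import Any, Dict, List, Optional


def build_sample_pipeline(sample_size: int = 1000,
                          query: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
    """Build a pipeline that randomly samples up to sample_size documents.

    Hypothetical helper: it mirrors the documented behaviour (optional
    query, then $sample), not Compass's real implementation.
    """
    pipeline: List[Dict[str, Any]] = []
    if query:
        # Restrict sampling to the documents matched by the query.
        pipeline.append({"$match": query})
    pipeline.append({"$sample": {"size": sample_size}})
    return pipeline


def build_min_max_pipeline(field: str) -> List[Dict[str, Any]]:
    """Build a pipeline that computes the exact min and max of a field."""
    return [{"$group": {"_id": None,
                        "min": {"$min": f"${field}"},
                        "max": {"$max": f"${field}"}}}]
```

Assuming a PyMongo collection object, either pipeline could be passed to `collection.aggregate(...)`. The exact min/max pipeline reads the whole collection, so it may be slow on large data sets.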
Collections in MongoDB 3.0 and 2.6 are sampled via a backwards
compatible algorithm executed entirely within Compass. It comprises
three phases:
- Query for a stream of _id values, limit 10000 descending by _id
- Read the stream of _ids and save sampleSize randomly chosen values. We
employ reservoir sampling to perform this efficiently.
- Then query the selected random documents by _id
The choice of sampling method is transparent in usage to the end-user.
sampleSize is currently set to 1000 documents.
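The first two phases of the backwards compatible algorithm can be sketched in plain Python. This is only an illustration under stated assumptions: the data is made up (100,000 integer _id values, with each document's value taken to be its _id), and the reservoir sampler is the textbook "Algorithm R", which may differ from what Compass actually runs. It shows why a minimum reported from the sample can be far from the true minimum of 1.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], sample_size: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Keep sample_size items chosen uniformly at random from a stream."""
    if rng is None:
        rng = random.Random()
    reservoir: List[T] = []
    for index, item in enumerate(stream):
        if index < sample_size:
            # Fill the reservoir with the first sample_size items.
            reservoir.append(item)
        else:
            # Keep the new item with probability sample_size / (index + 1).
            slot = rng.randint(0, index)
            if slot < sample_size:
                reservoir[slot] = item
    return reservoir


# Hypothetical collection: _id values 1..100000, read in descending order.
ids_descending = range(100_000, 0, -1)
first_10000 = ids_descending[:10_000]  # phase 1: limit 10000, descending by _id
sampled = reservoir_sample(first_10000, 1000, random.Random(0))  # phase 2

# The smallest sampled value cannot be below 90001, although the true minimum is 1.
print(len(sampled), min(sampled))
```

With the $sample path the 1000 documents are drawn from the whole collection instead, but a rare smallest value can still be absent from the sample, which would explain a displayed minimum such as 9 or 13 instead of 1.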