Correctly using allow_disk_usage with pymongo

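I have an aggregation that runs fine on smaller collections, but my collection is large, with over 7 million documents. I am essentially trying to group by key, key1 and key2.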

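def groupByThreeItems(self, db=None, col=None, key=None, key1=None, key2=None):
    coll = db[col]  # assumption: db is a pymongo Database and col is a collection name
    # Group on (key, key1); collect the key2 values and count documents per group.
    # Note: the options dict is passed positionally here, which triggers the error below.
    agg_result = coll.aggregate([{
        "$group":
         {'_id': {key: "$"+key, key1: "$"+key1},
           key2: {"$push": "$"+key2}, "Count": {"$sum": 1}
         }}], {'allow_disk_use': True})
    return [i for i in agg_result]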


AttributeError: 'dict' object has no attribute '_txn_read_preference'

However, when I do not use allow_disk_use, I get the following error.

pymongo.errors.OperationFailure: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in., full error: {'ok': 0.0, 'errmsg': "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", 'code': 292, 'codeName': 'QueryExceededMemoryLimitNoDiskUseAllowed'}



agg_result= coll.aggregate([...],allowDiskUse=True)


agg_result= coll.aggregate([..query], allowDiskUse=True)
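
To make the one-liners above concrete, here is a minimal sketch of the asker's function with the option passed as a keyword argument. The helper names build_group_pipeline and group_by_keys are hypothetical, and the sketch assumes coll is an existing pymongo Collection and that the missing stage name in the question was meant to be "$group".

```python
def build_group_pipeline(key, key1, key2):
    """Build a one-stage $group pipeline keyed on (key, key1)."""
    return [{
        "$group": {
            "_id": {key: "$" + key, key1: "$" + key1},
            key2: {"$push": "$" + key2},  # collect key2 values per group
            "Count": {"$sum": 1},         # number of documents per group
        }
    }]


def group_by_keys(coll, key, key1, key2):
    """Run the grouping on a pymongo Collection, allowing spills to disk."""
    pipeline = build_group_pipeline(key, key1, key2)
    # allowDiskUse is passed as a keyword argument, not as a positional dict.
    cursor = coll.aggregate(pipeline, allowDiskUse=True)
    return list(cursor)
```

Keeping the pipeline builder separate from the call makes it easy to inspect the stage before running it against a 7-million-document collection.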


All optional aggregate command parameters should be passed as keyword arguments to this method. Valid options include, but are not limited to:

allowDiskUse (bool): Enables writing to temporary files. When set to True, aggregation stages can write data to the _tmp subdirectory of the --dbpath directory. The default is False.
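
As a rough illustration of why the positional dict fails: the second positional parameter of Collection.aggregate is session, so a dict passed there is treated as a session object, which is consistent with the AttributeError in the question. The aggregate function below is only a stand-in with the same parameter layout, not pymongo's code.

```python
def aggregate(pipeline, session=None, **kwargs):
    """Stand-in mirroring the parameter order of Collection.aggregate."""
    return {"session": session, "options": kwargs}


# Positional dict: it binds to `session`, and no option reaches the command.
wrong = aggregate([], {"allow_disk_use": True})

# Keyword argument: it is collected into the command options.
right = aggregate([], allowDiskUse=True)

print(wrong)
print(right)
```

Note also that the option name is allowDiskUse (camelCase), matching the server's aggregate command, not allow_disk_use.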


allowDiskUse (boolean, optional): Enables writing to temporary files. When set to true, aggregation operations can write data to the _tmp subdirectory in the dbPath directory. See Perform Large Sort Operation with External Sort for an example.

Starting in MongoDB 4.2, the profiler log messages and diagnostic log messages include a usedDisk indicator if any aggregation stage wrote data to temporary files due to memory restrictions.