Project a numberDecimal value into a float in Python

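I'm new to MongoDB and PyMongo, but I have picked up a few lessons from MongoDB University. I have a nested document from which I want to extract only specific values. I'm currently trying to extract the numeric part of PriceEGC, without success. My code for building the projection and extracting the values looks like this: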

import os
import math
import pymongo
from pprint import pprint
from datetime import datetime
from bson.json_util import dumps
from bson.decimal128 import Decimal128

# more code above not shown
for collection in all_collections[:1]:
    first_seen_date = collection.name.split("_")[-1]
    projection = {
        (more projections)...,
        "RegoExpiryDate": "$Vehicle.Registration.Expiry",
        "VIN": "$_id",
        "ComplianceDate": None,
        "PriceEGC": "$Price.FinalDisplayPrice",  # <- this here is the problem
        "Price": None,
        "Reserve": "$Search.IsReservedDate",
        "StartingBid": None,
        "Odometer": "$Vehicle.Odometer",
        (more projections)...
    }

    batch_size = 1
    num_batches = math.ceil(collection.count_documents({}) / batch_size)

    for num in range(1):  # range(num_batches):
        pipeline = [
            {"$match": {}},
            {"$project": projection},
            {"$skip": batch_size * num},
            {"$limit": batch_size},
        ]
        aggregation = list(collection.aggregate(pipeline))
        yield aggregation

if __name__ == "__main__":
    print(dumps(next(get_all_collections()), indent=2))

A typical document looks like this:

With the aggregation, I print out this single document first to see what it looks like, before loading the whole collection somewhere.

This is the output I don't want:

[{
(more key-value pairs)...,
"RegoExpiryDate": "2021-08-31T00:00:00.000Z",
"VIN": "JTMRBREVX0D087618",
"ComplianceDate": null,
"PriceEGC": {
  "$numberDecimal": "36268.00"  # <- Don't want this
},
"Price": null,
"Reserve": null,
"StartingBid": null,
"Odometer": 54567,
(more key-value pairs)...
}]

This is the output I want:

[{
(more key-value pairs)...,
"RegoExpiryDate": "2021-08-31T00:00:00.000Z",
"VIN": "JTMRBREVX0D087618",
"ComplianceDate": null,
"PriceEGC": 36268.00, (or) "PriceEGC": "36268.00",  # <- Want this
"Price": null,
"Reserve": null,
"StartingBid": null,
"Odometer": 54567,
(more key-value pairs)...
}]

How should I write the projection or the pipeline so that I get the output I want, as shown above? I have already tried:

projection = {...,
"PriceEGC": "$Price.FinalDisplayPrice.$numberDecimal",
...
}

projection = {...,
"PriceEGC": {"$toDecimal": "$Price.FinalDisplayPrice"}
...
}

projection = {...,
"PriceEGC": Decimal128.to_decimal("$Price.FinalDisplayPrice")
...
}

and changing the pipeline:

pipeline = [
    {"$match": {}},
    {"$project": projection},
    {"$toDecimal": "$Price.FinalDisplayPrice"},
    {"$skip": batch_size * num},
    {"$limit": batch_size},
]

I'm only pattern-matching here, but

"name": "$name.name.name",

seems to work for

"RegoExpiryDate": "$Vehicle.Registration.Expiry",

so I tried the same pattern with PriceEGC:

    projection = {...
        "PriceEGC": "$Price.FinalDisplayPrice.numberDecimal",
    ...}

What you are seeing is the bson.json_util.dumps() representation of a MongoDB Decimal128 object. For details on this data type, see https://www.mongodb.com/developer/quickstart/bson-data-types-decimal128/

bson.json_util.dumps() adds the $numberDecimal wrapper so that the data type is preserved if you want to load the data back later.
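To make the wrapper concrete, here is a minimal standard-library sketch that post-processes output which has already been serialized by bson.json_util.dumps(). The helper name unwrap_number_decimal and its as_float parameter are hypothetical, and the sample string is a trimmed version of the output shown in the question.

```python
import json
from decimal import Decimal


def unwrap_number_decimal(value, as_float=True):
    """Recursively replace {"$numberDecimal": "..."} wrappers with plain values.

    Hypothetical helper: not part of PyMongo or bson.
    """
    if isinstance(value, dict):
        # A wrapper is a dict whose only key is "$numberDecimal".
        if set(value) == {"$numberDecimal"}:
            number = Decimal(value["$numberDecimal"])
            return float(number) if as_float else str(number)
        return {k: unwrap_number_decimal(v, as_float) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_number_decimal(v, as_float) for v in value]
    return value


# Trimmed sample of the Extended JSON from the question.
extended_json = (
    '[{"VIN": "JTMRBREVX0D087618", '
    '"PriceEGC": {"$numberDecimal": "36268.00"}, '
    '"Odometer": 54567}]'
)

plain = unwrap_number_decimal(json.loads(extended_json))
print(json.dumps(plain, indent=2))
```

Passing as_float=False keeps the value as the string "36268.00", which avoids any loss of precision from the float conversion.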

If you want different behavior, you can instead use the regular json.dumps() and override how Decimal128 is handled, for example:

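from pymongo import MongoClient
from bson.decimal128 import Decimal128
import json


class CustomJsonEncoder(json.JSONEncoder):
    def default(self, obj):
        # Emit Decimal128 as a plain float (very precise values may lose digits)
        if isinstance(obj, Decimal128):
            return float(obj.to_decimal())
        # Let the base class raise TypeError for any other unsupported type
        return super().default(obj)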


db = MongoClient()['mydatabase']
collection = db['mycollection']

collection.insert_one({'Price': {'FinalDisplayPrice': Decimal128('36268.00')}})
print(json.dumps(list(collection.find({}, {'_id': 0})), indent=4, cls=CustomJsonEncoder))

Prints:

[
    {
        "Price": {
            "FinalDisplayPrice": 36268.0
        }
    }
]
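An alternative is to do the conversion on the server inside the projection. $numberDecimal is only how the value is serialized, not a sub-field, so a path such as "$Price.FinalDisplayPrice.$numberDecimal" has nothing to resolve. Conversion operators are also expressions rather than pipeline stages, so they belong inside $project. The sketch below assumes MongoDB 4.0 or later (where $toDouble is available); build_pipeline is a hypothetical helper, and only three of the question's fields are shown.

```python
def build_pipeline(skip, limit):
    """Build an aggregation pipeline that returns PriceEGC as a double.

    Hypothetical helper for illustration; field paths are taken from the question.
    """
    projection = {
        "VIN": "$_id",
        # $toDouble is an expression operator, so it goes inside $project
        # rather than being used as its own pipeline stage.
        "PriceEGC": {"$toDouble": "$Price.FinalDisplayPrice"},
        "Odometer": "$Vehicle.Odometer",
    }
    return [
        {"$match": {}},
        {"$project": projection},
        {"$skip": skip},
        {"$limit": limit},
    ]


pipeline = build_pipeline(skip=0, limit=1)
print(pipeline)
```

The resulting list can be passed to collection.aggregate(pipeline). If the string form "36268.00" is preferred, $toString can be used in place of $toDouble.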