Project a numberDecimal value into a float in Python
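I'm new to MongoDB and PyMongo, but I've picked up a few lessons from MongoDB University. I have a nested document from which I only want to extract specific values. I'm currently trying to extract the numeric part of PriceEGC, without success. My code for building the projection and extracting the specific values is shown below: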
import os
import math
import pymongo
from pprint import pprint
from datetime import datetime
from bson.json_util import dumps
from bson.decimal128 import Decimal128

# more code above not shown
for collection in all_collections[:1]:
    first_seen_date = collection.name.split("_")[-1]
    projection = {
        (more projections)...,
        "RegoExpiryDate": "$Vehicle.Registration.Expiry",
        "VIN": "$_id",
        "ComplianceDate": None,
        "PriceEGC": "$Price.FinalDisplayPrice",  # <- this here is the problem
        "Price": None,
        "Reserve": "$Search.IsReservedDate",
        "StartingBid": None,
        "Odometer": "$Vehicle.Odometer",
        (more projections)...
    }
    batch_size = 1
    num_batches = math.ceil(collection.count_documents({}) / batch_size)
    for num in range(1):  # range(num_batches):
        pipeline = [
            {"$match": {}},
            {"$project": projection},
            {"$skip": batch_size * num},
            {"$limit": batch_size},
        ]
        aggregation = list(collection.aggregate(pipeline))
        yield aggregation

if __name__ == "__main__":
    print(dumps(next(get_all_collections()), indent=2))
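A typical document looks like this:
What I do with the aggregation is print out this single document to see what it looks like before loading the whole collection somewhere.
The output I don't want is this: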
[{
    (more key-value pairs)...,
    "RegoExpiryDate": "2021-08-31T00:00:00.000Z",
    "VIN": "JTMRBREVX0D087618",
    "ComplianceDate": null,
    "PriceEGC": {
        "$numberDecimal": "36268.00"  # <- Don't want this
    },
    "Price": null,
    "Reserve": null,
    "StartingBid": null,
    "Odometer": 54567,
    (more key-value pairs)...
}]
The output I do want is this:
[{
    (more key-value pairs)...,
    "RegoExpiryDate": "2021-08-31T00:00:00.000Z",
    "VIN": "JTMRBREVX0D087618",
    "ComplianceDate": null,
    "PriceEGC": 36268.00, (or) "PriceEGC": "36268.00",  # <- Want this
    "Price": null,
    "Reserve": null,
    "StartingBid": null,
    "Odometer": 54567,
    (more key-value pairs)...
}]
How should I write the projection or the pipeline so that I get the output I want, as shown above? I have already tried:
projection = {...,
    "PriceEGC": "$Price.FinalDisplayPrice.$numberDecimal",
    ...
}
and
projection = {...,
    "PriceEGC": {"$toDecimal": "$Price.FinalDisplayPrice"}
    ...
}
and
projection = {...,
    "PriceEGC": Decimal128.to_decimal("$Price.FinalDisplayPrice")
    ...
}
and changing the pipeline:
pipeline = [
    {"$match": {}},
    {"$project": projection},
    {"$toDecimal": "$Price.FinalDisplayPrice"},
    {"$skip": batch_size * num},
    {"$limit": batch_size},
]
I'm only pattern matching here, but "name": "$name.name.name" seems to work for "RegoExpiryDate": "$Vehicle.Registration.Expiry", so I tried the same pattern with PriceEGC:
projection = {...
    "PriceEGC": "$Price.FinalDisplayPrice.numberDecimal",
    ...}
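What you are seeing is the bson.json_util.dumps() representation of a MongoDB Decimal128 object. For more information on this data type, see https://www.mongodb.com/developer/quickstart/bson-data-types-decimal128/
The bson.json_util functions emit the $numberDecimal wrapper so that the data type is preserved if you later want to reload the data.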
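Because the wrapper is ordinary JSON, it can also be unwrapped after the fact. The following is only a sketch, assuming the JSON text produced by dumps() is already in hand; unwrap_number_decimal is a hypothetical helper name, and the sample text is cut down from the output shown in the question.

```python
import json
from decimal import Decimal


def unwrap_number_decimal(obj):
    """object_hook: replace {"$numberDecimal": "..."} with a Decimal."""
    # json.loads calls this for every decoded JSON object, innermost first.
    if set(obj) == {"$numberDecimal"}:
        return Decimal(obj["$numberDecimal"])
    return obj


# Shortened sample of the json_util output from the question.
text = '[{"VIN": "JTMRBREVX0D087618", "PriceEGC": {"$numberDecimal": "36268.00"}}]'

docs = json.loads(text, object_hook=unwrap_number_decimal)
price = docs[0]["PriceEGC"]

price_as_float = float(price)   # 36268.0, may lose precision for some values
price_as_string = str(price)    # "36268.00", keeps the stored digits
```

Going through Decimal first keeps both target shapes from the question available: a float for arithmetic or a string that preserves the trailing zeros.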
If you want different behaviour, you may want to use the regular json.dumps() and override how Decimal128 is handled, e.g.
import json

from bson.decimal128 import Decimal128
from pymongo import MongoClient


class CustomJsonEncoder(json.JSONEncoder):
    def default(self, obj):
        # Serialise Decimal128 as a plain float (this can lose precision).
        if isinstance(obj, Decimal128):
            return float(obj.to_decimal())
        # Defer to the base class so unsupported types still raise TypeError.
        return super().default(obj)


db = MongoClient()['mydatabase']
collection = db['mycollection']
collection.insert_one({'Price': {'FinalDisplayPrice': Decimal128('36268.00')}})
print(json.dumps(list(collection.find({}, {'_id': 0})), indent=4, cls=CustomJsonEncoder))
prints:
[
    {
        "Price": {
            "FinalDisplayPrice": 36268.0
        }
    }
]
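If the conversion should happen on the server instead, the projection itself can convert the value. The sketch below assumes MongoDB 4.0 or later, where the $toDouble and $toString aggregation operators are available; price_projection is a hypothetical helper name, and only the PriceEGC field from the question is shown. Note that $toDecimal, tried in the question, leaves the value as a Decimal128, which is why the output did not change.

```python
def price_projection(as_string=False):
    """Build a $project spec that converts the Decimal128 price server-side."""
    # $toDouble yields a BSON double; $toString yields the decimal's text form.
    operator = "$toString" if as_string else "$toDouble"
    return {"PriceEGC": {operator: "$Price.FinalDisplayPrice"}}


# Conversion operators are expressions, so they belong inside $project,
# not as a pipeline stage of their own.
pipeline = [
    {"$match": {}},
    {"$project": price_projection()},
    {"$limit": 1},
]
```

The pipeline would then be passed to collection.aggregate(pipeline) as in the question. A double can lose precision for some decimal values, so the string form may be the safer choice for prices.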