How to access nested keys and their values to modify or delete all at once?
I'm learning Python and a few other things because I need to collect a lot of data for my project, and I'm stuck here. I'm using scrapy to scrape a JSON response from an API, which looks like this:
{
    "status": "ok",
    "status_message": "Query was successful",
    "data": {
        "product_count": 40993,
        "limit": 20,
        "page_number": 1,
        "products": [
            {
                "id": 41789,
                "url": "https://anything1.com",
                "product_name": "product1",
                "manufacturing_date": "19.12.2014",
                "rating": 5.3,
                "material": "something",
                "description": "",
                "cover_image": "anycover1.com",
                "state": "ok",
                "variants": [
                    {
                        "url": "https://anyvariant1.com",
                        "product_code": "55BEF7",
                        "material": "something",
                        "size": "small",
                        "dimensions": ""
                    },
                    {
                        "url": "https://anyvariant2.com",
                        "product_code": "55BEF8",
                        "material": "something",
                        "size": "medium",
                        "dimensions": ""
                    },
                    {
                        "url": "https://anyvariant3.com",
                        "product_code": "55BEF9",
                        "material": "something",
                        "size": "large",
                        "dimensions": ""
                    }
                ]
            },
            {
                "id": 41790,
                "url": "https://anything2.com",
                "product_name": "product2",
                "manufacturing_date": "02.10.2014",
                "rating": 7.2,
                "material": "something",
                "description": "",
                "cover_image": "anycover2.com",
                "state": "ok",
                "variants": [
                    {
                        "url": "https://anyvariant4.com",
                        "product_code": "55BEG7",
                        "material": "something",
                        "size": "small",
                        "dimensions": ""
                    },
                    {
                        "url": "https://anyvariant5.com",
                        "product_code": "55BEG8",
                        "material": "something",
                        "size": "medium",
                        "dimensions": ""
                    },
                    {
                        "url": "https://anyvariant6.com",
                        "product_code": "55BEG9",
                        "material": "something",
                        "size": "large",
                        "dimensions": ""
                    }
                ]
            },
            {
                _______
            },
            {
                _______
            }
        ]
    },
    "@meta": {
        "server_time": 1651288705,
        "execution_time": "0.01 ms"
    }
}
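This is what my spider code looks like:
import json

# Assumed to be the spider's callback; the original snippet showed only the body.
def parse(self, response):
    # Decode the JSON body returned by the API
    data = json.loads(response.body)
    data_main = data['data']['products']
    product_list = []
    for item in data_main:
        # Pull out the top-level fields of each product
        product_id = item['id']
        url = item['url']
        product_name = item['product_name']
        rating = item['rating']
        cover_image = item['cover_image']
        description = item['description']
        product = {
            'id': product_id,
            'url': url,
            'name': product_name,
            'image': cover_image,
            'rating': rating,
            'description': description
        }
        product_list.append(product)
    return product_list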
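With this, the keys id, url, name, image, rating and description and their values are all accessible. But I can't work out how to access and modify the nested keys and their values all at once (while ignoring some of the keys and values). How should I do that? If there is better code that achieves what I need, please suggest it.
Thanks a lot.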
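By nested keys and values, I assume you mean the keys and values under variants. You can access them the same way you iterate over the items:
# Runs inside the existing "for item in data_main:" loop
variant_list = []
for variant in item['variants']:  # the key must be a quoted string
    url = variant['url']
    # and so on... for whatever other keys you're interested in
    new_variant = {'url': url}  # and whatever other keys you want
    variant_list.append(new_variant)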
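To show how the inner loop fits together with the outer one, here is a minimal sketch that keeps only a chosen set of keys at both levels. The helper names (PRODUCT_KEYS, VARIANT_KEYS, pick, extract_products) and the choice of which keys to keep are assumptions made for illustration; they are not part of scrapy or of the API in the question.

```python
# All names below (PRODUCT_KEYS, VARIANT_KEYS, pick, extract_products) are
# hypothetical helpers for illustration, not part of scrapy or the API.
PRODUCT_KEYS = ("id", "url", "product_name", "rating")
VARIANT_KEYS = ("url", "product_code", "size")


def pick(source, keys):
    """Return a new dict with only the requested keys that exist in source."""
    return {key: source[key] for key in keys if key in source}


def extract_products(data):
    """Build a trimmed product list, each with a trimmed list of variants."""
    products = []
    for item in data["data"]["products"]:
        product = pick(item, PRODUCT_KEYS)
        # item.get() tolerates products that have no "variants" key at all
        product["variants"] = [
            pick(variant, VARIANT_KEYS) for variant in item.get("variants", [])
        ]
        products.append(product)
    return products


# A cut-down copy of the response from the question, used only as test data
sample = {
    "data": {
        "products": [
            {
                "id": 41789,
                "url": "https://anything1.com",
                "product_name": "product1",
                "rating": 5.3,
                "state": "ok",
                "variants": [
                    {"url": "https://anyvariant1.com", "product_code": "55BEF7",
                     "material": "something", "size": "small", "dimensions": ""},
                ],
            }
        ]
    }
}

trimmed = extract_products(sample)
print(trimmed[0]["variants"])
```

Listing the wanted keys in one place means that ignoring another key is a one-line change rather than a new assignment inside the loop.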
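I do wonder, though, why you are rebuilding dictionaries that are so similar to the ones the JSON parser already gives you. For many purposes you might as well keep using the dictionaries that json.loads returns.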
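If keeping the parsed dictionaries is acceptable, the nested values can be modified or deleted in place instead of being copied. The following is only a sketch under that assumption: clean_variants is a hypothetical helper, and the keys it drops and the upper-casing of size are arbitrary examples of "delete" and "modify".

```python
import json


# Hypothetical helper; the keys to drop and the "size" tweak are only examples.
def clean_variants(data, drop_keys=("material", "dimensions")):
    """Edit the parsed JSON in place: drop unwanted variant keys, adjust one value."""
    for item in data["data"]["products"]:
        for variant in item.get("variants", []):
            for key in drop_keys:
                variant.pop(key, None)  # no error if the key is absent
            if "size" in variant:
                variant["size"] = variant["size"].upper()
    return data


# A cut-down copy of the response from the question, used only as test data
raw = '''{"data": {"products": [{"id": 41790, "variants": [
    {"url": "https://anyvariant4.com", "product_code": "55BEG7",
     "material": "something", "size": "small", "dimensions": ""}]}]}}'''

cleaned = clean_variants(json.loads(raw))
print(cleaned["data"]["products"][0]["variants"])
```

Because the function mutates the dictionaries it is given, the result is the same object that was passed in; make a copy first (for example with copy.deepcopy) if the untouched response is still needed.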