How do you parse JSON data that is nested under multiple layers?
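The goal of the code is to take the JSON I fetched and parse it into raw location data (address, postal code, etc.). I'm very new to coding; this is a one-off task for a school project in my geography studies, and I need the locations of all McDonald's in Canada. So any other learning tools are welcome. The main problem I'm running into is that I want to write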
for blank in blanks['']:
so that I can pull out the data for the CSV output. But I noticed that my data sits under multiple layers.
For example:
{
  "features": [
    {
      "geometry": {
        "coordinates": [
          -79.28662,
          43.68758
        ]
      },
      "properties": {
        "name": "Vic Park/Gerrard",
        "shortDescription": "VIC PARK/G",
        "longDescription": "VIC PARK/GERRARD",
        "todayHours": "06:00 - 22:00",
        "driveTodayHours": "00:00 - 00:00",
        "id": "195500517230-en-ca",
        "filterType": [
          "ALL_DAY_BREAKFAST",
          "BAKERY",
          "BREAKFAST",
          "CYT",
          "DRIVETHRU",
          "INDOORDINING",
          "MCCAFE",
          "MOBILEOFFERS",
          "MOBILEORDERS",
          "PARKINGAREA",
          "TWENTYFOURHOURS",
          "WIFI"
        ],
        "addressLine1": "2480 GERRARD STREET EAST",
        "addressLine2": "",
        "addressLine3": "SCARBOROUGH",
        "addressLine4": "Canada",
        "subDivision": "",
        "postcode": "M1N 4C3",
        "customAddress": "SCARBOROUGH, M1N 4C3",
        "telephone": "4166903659",
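The information I want seems (I'm not sure) to be under properties, but my
for store in stores['features']:
statement doesn't let me pull out 'addressLine1' or the other fields individually for the CSV. I'm wondering whether anyone has a solution for parsing data like this.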
P.S. I've included my full code in case there is a deeper problem.
import requests
import csv
import json
url = "https://www.mcdonalds.com/googleapps/GoogleRestaurantLocAction.do?method=searchLocation&latitude=43.6936965&longitude=-79.2969938&radius=1000000&maxResults=1700&country=ca&language=en-ca&showClosed=&hours24Text=Open%2024%20hr"
payload={}
files={}
headers = {
'authority': 'www.mcdonalds.com',
'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
'accept': '*/*',
'x-requested-with': 'XMLHttpRequest',
'sec-ch-ua-mobile': '?0',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://www.mcdonalds.com/ca/en-ca/restaurant-locator.html',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
'cookie': 'bm_sz=C04645E7F7A956C5F9D9C5A20DEAEC97~YAAQ1Cv2SEtfMBN6AQAAItxfEwwTVV2V2Tr7UWpPt1Ps7gl84FzQlmbWIm4kBBh5dxlK3w8RenwiEiKtvERE6dLmrwPwJUuy+14gU/LeEZvP+uxzyBr04oQXdcSEQuiOgdkAGasqnBrTw1mp5E5iehnRpvHBDdSqh8wRSgJV0eG4f8YwSz66BfntCBALtQNCAFK2; _abck=F05779F2345218EA4989FF467D897C5A~0~YAAQ1Cv2SExfMBN6AQAAItxfEwaIwCrBeP25JBhBb7TX+HmnLQgrj1TkosrB+oHSv9ctrxRukqEDUaHPL1KkjpqjY1XY1yyulQ0ZRhsEfhY968YVsTOqfiosAu3kykd3pJG/bQ37XHwWs5qXpIdhMXRwJwXmkYtl3ETG8kXK2iZ22Q31COaSjNVACLaa7s9tCk9ItgLvUj5x9Nldjnd8AdXR0pXicrQY1IaruJyNqwMcJv42AUHW7iH4Ex9ZOSYsgEjLMNd44mS525X/gSNUTSOzoqoWsnH4MU59vfgLTwc2hVncAv67LBViTLxbWw4eVAvz7Z5phQfCmvoIy0PD8gy5iwPDMaD3GASrK9xScDPAPUI2wquxmSJ+f2cQaxZQKhvJCeH9cz14OZfx8ksA2ss53E0l0kDvgmnw~-1~-1~-1; ak_bmsc=BA4817D8DEE20E92C1E6251C54FC124348F62BD48F5F00005F91C9608B679D5F~plUkbYfsvYr5dCayJ9dMGEJ3QDgkmkv2mLpE7pCY9vW0xrdawvmyxfSnupw/4F7C48Akdn8PKsBniqz+7F+RZb8v4AkvH3c0RuvnynqJoni+kJcDYtPOxdMvdtGdTlZGIkSQNfpcxHNQDVlzojdSBX0vyBh/8seKQv10U67M7m787olYzg9jnsUwk3/VHBrnMDogiWJT8rNV7saSXunN0pAgucZWo/XhCpTJL+tI9urt0=; MCDCountry_code=US; bm_mi=BEE06312635FD442995BC0237BAFDA7C~f/RxgMW/JJSUc/wB9ZRg9fPD/76+wq/TaoWEZR1/ttrAiVTO256xhDTsVYc/kdHIjWkxvfO4XDcBjqe4hQ4qXt8Anpfi09vna/zcC7l6OVWpWeRSoZNztl7h5VF407L3XG+9CpzjSHNcaqAPRk5d0J5gLMtL/KmR8XBkAC0Syim7ST97nxNrPfLdlkSPMGm4Oy86xvY5PH5Nu47zS/gwhanBFg69tAdrQdaZewE2eGuzoJPsZit3UsihTzhXc4LY92hfSdh3/kZRId+NE8Jp0w==; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN4a9yst3R170rBCm1egzGvCBmB1jq9aCwQm5VgIJgloPOdpiIPfD3kDxFbKhqMuS5U=; JSESSIONID=64PZkBXhhpvNjM4NganzSZ0r1npIIaM7Fo84EsxN.eap7node7; 
_abck=F05779F2345218EA4989FF467D897C5A~-1~YAAQ1Cv2SExyMBN6AQAA5Et0EwZueCejZbKz1VDGCq2sB43Yx4dq0SiiGeUS6gVpXRIdw3rA3OdpNGHq7tVzQ+IvPpEKwLML9736x1qB5SQxV3jai89y2B2QF6K8nKtyrDAes0qbeTyIrHu0Rh1HLs7CjNxiLi0wswbCZfSsPI6fJZiEt+Itre3lfmua/HkhIRwpVTKqlVN5eQ8XIX+s1jJbINx/jUmMTW+jB5k4A5NARGChYH7rJQGYIT/oyZYpSbS3Yweqa4FRgGMW4gYZBN39+t2xSfewADLdpihfOnoZtakw9VhcvAKaf4mEzjB7WEfNJIZSjSE8DzvbJNIF41MGuAhhrnEBwBE8uVCZsA+2qjVPSADVp2Nn8JanJXCbucnLFOLsmPz3oVtGzentht1cHog4+eYOUlmw~0~-1~-1; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN5ZCTzA250oKEeVeXaa6j4gEGJ9RRtrTXQdYXzzSx6fM9aLwif+We2vtIc1yLQgTt4=',
'dnt': '1'
}
response = requests.request("GET", url, headers = headers, data = payload, files = files)
stores = json.loads(response.text)
with open('Mcdonlocation.csv', mode='w') as CSVFile:
    writer = csv.writer(CSVFile, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerow([
        "addressLine1",
        "addressLine2",
        "addressLine3",
        "subDivision",
        "postcode",
        "telephone"
    ])
    for store in stores['features']:
        row = []
        Match_Address1= store['properties']["addressLine1"]
        Match_Address2= store['properties']["addressLine2"]
        Match_Address3= store['properties']["addressLine3"]
        subDivision= store['properties']["subDivision"]
        Postalcode= store['properties']["postcode"]
        telephone= store['properties']["telephone"]
        row.append(Match_Address1)
        row.append(Match_Address2)
        row.append(Match_Address3)
        row.append(subDivision)
        row.append(Postalcode)
        row.append(telephone)
        writer.writerow(row)
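I think the basic answer to your question is "look at the types". The Python json conversion table tells you which Python type to expect for each kind of JSON value. In the Python interpreter, let's load your file and see what we have: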
>>> input = dat.read()
>>> stores = json.loads(input)
>>> type(stores)
<class 'dict'>
>>> type(stores['features'])
<class 'list'>
>>> type( stores['features'][0] )
<class 'dict'>
>>> type( stores['features'][0]['properties'] )
<class 'dict'>
>>> type( stores['features'][0]['properties']['telephone'] )
<class 'str'>
>>> stores['features'][0]['properties']['telephone']
'4166903659'
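Every object has a type, and every type has its own methods. Just follow the chain one level at a time: index a dict by key, index a list by position.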
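If it helps, following the chain can be wrapped in a small helper. This is only a sketch: `dig` and its `default` parameter are names I made up for illustration, and the sample dict just copies a few values from the JSON in the question.

```python
from typing import Any


def dig(obj: Any, *path: Any, default: Any = None) -> Any:
    """Follow a chain of dict keys / list indices; return default if a step is missing."""
    current = obj
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key; IndexError: list too short;
            # TypeError: tried to index something that is not a dict or list.
            return default
    return current


# Small sample shaped like the JSON in the question.
sample = {
    "features": [
        {
            "geometry": {"coordinates": [-79.28662, 43.68758]},
            "properties": {
                "addressLine1": "2480 GERRARD STREET EAST",
                "telephone": "4166903659",
            },
        }
    ]
}

print(dig(sample, "features", 0, "properties", "telephone"))
print(dig(sample, "features", 0, "properties", "fax", default=""))
```

The second call asks for a key that does not exist in the sample, so it falls back to the default instead of raising.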
It looks like your data is structured as follows:
features:
  - geometry:
      coordinates:
        - float
        - float
    properties:
      addressLine1: str
      addressLine2: str
      addressLine3: str
      addressLine4: str
      customAddress: str
      driveTodayHours: str
      filterType:
        - str
        - ...
      id: str
      longDescription: str
      name: str
      postcode: str
      shortDescription: str
      subDivision: str
      telephone: str
      todayHours: str
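An outline like this can be generated rather than written by hand. Below is a rough sketch; `describe` is a hypothetical helper, not something from the json module, and it assumes every list is homogeneous, so it only inspects the first element.

```python
def describe(obj, indent=0):
    """Return lines summarising the type found at each level of parsed JSON."""
    pad = "  " * indent
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            if isinstance(value, (dict, list)):
                lines.extend(describe(value, indent + 1))
    elif isinstance(obj, list) and obj:
        # Assumption: all elements share one shape, so the first is representative.
        first = obj[0]
        lines.append(f"{pad}- {type(first).__name__}")
        if isinstance(first, (dict, list)):
            lines.extend(describe(first, indent + 1))
    return lines


# Cut-down sample shaped like the JSON in the question.
sample = {
    "features": [
        {
            "geometry": {"coordinates": [-79.28662, 43.68758]},
            "properties": {"name": "Vic Park/Gerrard", "filterType": ["WIFI"]},
        }
    ]
}

lines = describe(sample)
print("\n".join(lines))
```

Running it on the real response (the `stores` variable from the question) should show at a glance which keys hold dicts, which hold lists, and which hold plain strings.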
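You want to extract information from the individual "features" elements, which appear to be the stores, so your existing logic already gets you off to a good start with code like this: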
for store in data['features']:
    csv_row = process_store(store)
From there you only need to decide what to extract from each store, for example:
coord_1 = store['geometry']['coordinates'][0]
custom_address = store['properties']['customAddress']
...
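To make that concrete, here is one way `process_store` might look. It is a sketch under a few assumptions: the `FIELDS` list is just the six columns from the question, missing keys become empty strings, and the coordinate pair is read as longitude first, then latitude (the usual GeoJSON order, which matches the sample values but which I have not confirmed for this API). Note that in the sample JSON, `coordinates` sits under `geometry`, not directly on the store.

```python
# Columns taken from the question's CSV header.
FIELDS = [
    "addressLine1",
    "addressLine2",
    "addressLine3",
    "subDivision",
    "postcode",
    "telephone",
]


def process_store(store):
    """Turn one element of 'features' into a flat list of CSV values."""
    props = store.get("properties", {})
    coords = store.get("geometry", {}).get("coordinates", [])
    # Pad so a missing or short coordinate list yields None instead of an error.
    longitude, latitude = (list(coords) + [None, None])[:2]
    row = [props.get(field, "") for field in FIELDS]
    row.extend([latitude, longitude])
    return row


# One store shaped like the JSON in the question.
sample_store = {
    "geometry": {"coordinates": [-79.28662, 43.68758]},
    "properties": {
        "addressLine1": "2480 GERRARD STREET EAST",
        "addressLine2": "",
        "addressLine3": "SCARBOROUGH",
        "subDivision": "",
        "postcode": "M1N 4C3",
        "telephone": "4166903659",
    },
}

row = process_store(sample_store)
print(row)
```

Using `.get` with a default means a store that lacks, say, a telephone number still produces a full row rather than stopping the loop with a KeyError.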
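Compared with your existing code, I think you just hadn't noticed that there is a properties key that gives you access to the content beyond the initial features level.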
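Since each properties value is already a dict, csv.DictWriter from the standard library can write it almost directly. The sketch below writes to an in-memory buffer so it runs without the network request; `write_stores` and `FIELDS` are names I chose for illustration, and for a real file you would pass the object returned by `open(..., 'w', newline='')` instead of the buffer.

```python
import csv
import io

# Columns taken from the question's CSV header.
FIELDS = [
    "addressLine1",
    "addressLine2",
    "addressLine3",
    "subDivision",
    "postcode",
    "telephone",
]


def write_stores(stores, out):
    """Write one CSV row per element of stores['features'] to the file-like object out."""
    # extrasaction="ignore" drops keys that are not in FIELDS (name, filterType, ...);
    # keys missing from a store are filled with the default restval, an empty string.
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for store in stores.get("features", []):
        writer.writerow(store.get("properties", {}))


# Sample shaped like the JSON in the question.
stores = {
    "features": [
        {
            "geometry": {"coordinates": [-79.28662, 43.68758]},
            "properties": {
                "name": "Vic Park/Gerrard",
                "filterType": ["WIFI"],
                "addressLine1": "2480 GERRARD STREET EAST",
                "addressLine2": "",
                "addressLine3": "SCARBOROUGH",
                "subDivision": "",
                "postcode": "M1N 4C3",
                "telephone": "4166903659",
            },
        }
    ]
}

buffer = io.StringIO()
write_stores(stores, buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

This replaces the block of repeated `row.append(...)` calls with a single list of column names, so adding or removing a column is a one-line change.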