Pandas json_normalize function doesn't export to Excel file correctly
I'm using the json_normalize function, but it doesn't export the JSON to an Excel file correctly.
Here is my code:
import requests
import pandas as pd

urls = [
    'https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=11764&examtype=B',
]
for url in urls:
    r = requests.get(url=url).json()
    print(r)
    jn = pd.json_normalize(r)
    df = pd.DataFrame(jn)
    df.to_excel('data.xlsx')
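json_normalize leaves the examInformation column as a nested list of JSON objects. You can add one of the following after the jn = pd.json_normalize(r) line to expand that list into separate rows.
Using .explode() + pd.Series: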
jn = jn.drop('examInformation', axis=1).join(jn.explode('examInformation').apply(lambda x: pd.Series(x['examInformation']), axis=1))
Or use .explode() + pd.DataFrame for faster execution:
jn_exp = jn['examInformation'].explode()
jn = jn.drop('examInformation', axis=1).join(pd.DataFrame(jn_exp.tolist(), index=jn_exp.index))
Result:
See the expanded examInformation fields in the rightmost columns below:
print(jn)
vehicleCategory type id name lat lon examInformationAllLocations.allAttempts examInformationAllLocations.successfulAllAttemptsPercentage contactInformation.streetName contactInformation.houseNumber contactInformation.houseNumberExtension contactInformation.zipCode contactInformation.city contactInformation.website contactInformation.email contactInformation.phone1 contactInformation.phone2 contactInformation.kvk contactInformation.drivingSchoolNumber contactInformation.tradeAssociations lessonTypes.Theorieopleidingen lessonTypes.Beroepsopleidingen lessonTypes.Bijzonderheden lessonTypes.Praktijkopleidingen cbrLocation cbrLocationShortName cbrLocationLink locationSuccessfulPercentage drivingSchoolSuccessfulPercentage firstAttempts successfulFirstAttemptsPercentage retakeAttempts successfulSecondAttemptsPercentage
0 B rijschool 11764 Rijschool Baron 51.694322 5.287478 81 43 Amperestraat 28 5223CV 'S-HERTOGENBOSCH 06 44 30 30 81 743056970000 1666U0 [] [] [] [] Examencentrum Tiel Tiel /nl/service/nl/artikel/examencentrum-tiel-1.htm 57 100 0 0 1 100
0 B rijschool 11764 Rijschool Baron 51.694322 5.287478 81 43 Amperestraat 28 5223CV 'S-HERTOGENBOSCH 06 44 30 30 81 743056970000 1666U0 [] [] [] [] Examencentrum Den Bosch Den Bosch /nl/service/nl/artikel/examencentrum-den-bosch.htm 54 42 30 43 50 42
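As an alternative to exploding after the fact, json_normalize can do the expansion itself through its record_path and meta parameters. The following is only a minimal sketch under stated assumptions: the sample dictionary is hypothetical, shaped like a small subset of the response shown above rather than the real payload, and the variable names sample and flat are made up for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like part of the API response (not the real payload).
sample = {
    "id": 11764,
    "name": "Rijschool Baron",
    "contactInformation": {"city": "'S-HERTOGENBOSCH", "zipCode": "5223CV"},
    "examInformation": [
        {"cbrLocation": "Examencentrum Tiel", "firstAttempts": 0},
        {"cbrLocation": "Examencentrum Den Bosch", "firstAttempts": 30},
    ],
}

# record_path picks the nested list that should become one row per entry;
# meta lists the parent fields to repeat on every row (a list is a nested path).
flat = pd.json_normalize(
    sample,
    record_path="examInformation",
    meta=["id", "name", ["contactInformation", "city"]],
)
print(flat)
```

With this approach every parent field you want to keep has to be listed in meta, so the explode variants above may be more convenient when the response has many top-level fields. The resulting frame can then be written with flat.to_excel('data.xlsx', index=False), which assumes an Excel writer such as openpyxl is installed.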