Trying to select a JSONL data column nested inside another column with .loc, but getting a KeyError even though the key exists

This is my JSONL data structure:

"content": "Not yall gassing up a gay boy with no rhythm", "place": {"_type": "snscrape.modules.twitter.Place", "fullName": "Manhattan, NY", "name": "Manhattan", "type": "city", "country": "United States", "countryCode": "US"}

I tried to use this code to select the country code from the place column:

country_df = test_df.loc[test_df['place'].notnull(), ['content', 'place']]
countrycode_df = country_df["place"].loc["countryCode"]

But it gives me this error:

KeyError: 'countryCode'

How can I solve this?

I tried this method, but it does not fit my case.

You can access it through the str accessor:

country_df['place'].str['countryCode']

Output:

0    US
Name: place, dtype: object
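The reason the original attempt fails is that .loc on a Series looks up index labels (here the row labels 0, 1, ...), not keys inside each cell's dict. The sketch below reproduces the error and the fix on a small made-up frame; the second row, the trimmed "place" dict, and the variable name country_codes are assumptions for illustration, not the asker's real data.

```python
import pandas as pd

# Hypothetical two-row frame: one row with a place dict, one without.
test_df = pd.DataFrame({
    "content": ["Not yall gassing up a gay boy with no rhythm", "a tweet with no place"],
    "place": [{"fullName": "Manhattan, NY", "countryCode": "US"}, None],
})

country_df = test_df.loc[test_df["place"].notnull(), ["content", "place"]]

# .loc searches the Series index for the label "countryCode", which does not exist.
try:
    country_df["place"].loc["countryCode"]
except KeyError as exc:
    print("KeyError:", exc)

# .str[...] applies the lookup to each element, so it reaches inside each dict.
country_codes = country_df["place"].str["countryCode"]
print(country_codes)
```

If the .str form feels too implicit, country_df["place"].apply(lambda p: p["countryCode"]) should give the same result on rows where "place" is a dict.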

Because "place" is basically a dict (a nested dictionary), you can access it the same way you would a top-level dict:

country = {"content": "Not yall gassing up a gay boy with no rhythm", "place": {"_type": "snscrape.modules.twitter.Place", "fullName": "Manhattan, NY", "name": "Manhattan", "type": "city", "country": "United States", "countryCode": "US"}}
country["place"]["countryCode"]

Output:

'US'

However, pandas json_normalize() may suit your purpose better:

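import pandas as pd

# Flatten the nested "place" dict into dotted column names such as place.countryCode
country_df = pd.json_normalize(data=country)

print(country_df)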

Output:

content place._type place.fullName place.name place.type place.country place.countryCode
Not yall gassing up a gay boy with no rhythm snscrape.modules.twitter.Place Manhattan, NY Manhattan city United States US
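For a whole JSONL file rather than a single dict, the same idea can be applied to a list of parsed records. This is a minimal sketch, assuming each line is one JSON object and that "place" may be null on some rows; the two sample lines and the names jsonl_text, records, and flat_df are invented for illustration.

```python
import json
import pandas as pd

# Hypothetical JSONL content: one JSON object per line, "place" may be null.
jsonl_text = """{"content": "first tweet", "place": {"name": "Manhattan", "countryCode": "US"}}
{"content": "second tweet", "place": null}"""

# Parse each non-empty line into a dict.
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

# json_normalize accepts a list of dicts and flattens nested dicts per record.
flat_df = pd.json_normalize(records)

# Rows without a place dict end up with a missing value in place.countryCode.
print(flat_df[["content", "place.countryCode"]])
```

After flattening, the country code is an ordinary column, so flat_df["place.countryCode"] can be filtered or grouped like any other.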