Converting a sub OrderedDict to a DataFrame
I'm working in Jupyter and have pulled data from the Airtable API. It is currently stored as multiple OrderedDicts, and I need to convert this data into separate DataFrames.
OrderedDict([('records',
[OrderedDict([('id', 'rec0O8L1dlrobrPtj'),
('fields', OrderedDict()),
('createdTime', '2018-05-18T05:36:54.000Z')]),
OrderedDict([('id', 'rec13WqEutT0SwIP0'),
('fields',
OrderedDict([('Lead ID', '64556'),
('Company Name',
'CesKath (Ukay-Ukay) / KRKK Online Shop'),
('Client Name',
'Kamille Rona Venturina Taytay'),
('Principal Defendant Name/s',
'n/a'),
('Co-Defendant Name/s', 'n/a'),
('Plaintiff', 'n/a'),
('Nature of Case', 'n/a'),
('Trial Court', 'n/a'),
('City/Province', 'n/a'),
('Sala No.', 'n/a'),
('Case Number', 'n/a'),
('Case Status', 'n/a'),
('Address', 'n/a')])),
I tried the following code, which converts everything into a single DataFrame.
df = pd.DataFrame.from_dict(data)
When I run this code, it produces the following result:
records offset
0 {'id': 'rec0O8L1dlrobrPtj', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
1 {'id': 'rec13WqEutT0SwIP0', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
2 {'id': 'rec22sGXgPU9hFbTq', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
3 {'id': 'rec2a4MQL24dQhGzI', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
4 {'id': 'rec3VBhG7u55BQsFy', itr67AuLTHCfW40zH/recblaoEXrMrbx7Yt
I need to access the OrderedDict at the third level of indentation, i.e.:
('Lead ID', '64556'),
('Company Name',
'CesKath (Ukay-Ukay) / KRKK Online Shop'),
('Client Name',
'Kamille Rona Venturina Taytay'),
('Principal Defendant Name/s',
'n/a'),
('Co-Defendant Name/s', 'n/a'),
('Plaintiff', 'n/a'),
('Nature of Case', 'n/a'),
('Trial Court', 'n/a'),
('City/Province', 'n/a'),
('Sala No.', 'n/a'),
('Case Number', 'n/a'),
('Case Status', 'n/a'),
('Address', 'n/a')])),
How can I access the sub OrderedDict and convert it into a DataFrame?
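Here is one approach: take the "fields" mapping from each record and pass the resulting list to the DataFrame constructor.
Demo: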
from collections import OrderedDict

import pandas as pd

# Sample data shaped like the Airtable response in the question.
data = OrderedDict([('records',
    [OrderedDict([('id', 'rec0O8L1dlrobrPtj'),
                  ('fields', OrderedDict()),
                  ('createdTime', '2018-05-18T05:36:54.000Z')]),
     OrderedDict([('id', 'rec13WqEutT0SwIP0'),
                  ('fields',
                   OrderedDict([('Lead ID', '64556'),
                                ('Company Name',
                                 'CesKath (Ukay-Ukay) / KRKK Online Shop'),
                                ('Client Name',
                                 'Kamille Rona Venturina Taytay'),
                                ('Principal Defendant Name/s',
                                 'n/a'),
                                ('Co-Defendant Name/s', 'n/a'),
                                ('Plaintiff', 'n/a'),
                                ('Nature of Case', 'n/a'),
                                ('Trial Court', 'n/a'),
                                ('City/Province', 'n/a'),
                                ('Sala No.', 'n/a'),
                                ('Case Number', 'n/a'),
                                ('Case Status', 'n/a'),
                                ('Address', 'n/a')]))])
    ]
)])

# Each record's "fields" mapping becomes one row; an empty mapping yields a row of NaN.
df = pd.DataFrame([d["fields"] for d in data["records"]])
print(df)
Output:
Lead ID Company Name \
0 NaN NaN
1 64556 CesKath (Ukay-Ukay) / KRKK Online Shop
Client Name Principal Defendant Name/s \
0 NaN NaN
1 Kamille Rona Venturina Taytay n/a
Co-Defendant Name/s Plaintiff Nature of Case Trial Court City/Province \
0 NaN NaN NaN NaN NaN
1 n/a n/a n/a n/a n/a
Sala No. Case Number Case Status Address
0 NaN NaN NaN NaN
1 n/a n/a n/a n/a
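Row 0 in the output above is all NaN because the first record has an empty "fields" mapping, and the default integer index no longer says which Airtable record a row came from. If that matters, a possible variation is sketched below. The helper name `records_to_frame` and the shortened two-field sample are made up for illustration; only the "records", "id" and "fields" keys come from the question.

```python
from collections import OrderedDict

import pandas as pd

# Shortened, illustrative sample with the same nesting as the question's data.
data = OrderedDict([
    ("records", [
        OrderedDict([
            ("id", "rec0O8L1dlrobrPtj"),
            ("fields", OrderedDict()),
        ]),
        OrderedDict([
            ("id", "rec13WqEutT0SwIP0"),
            ("fields", OrderedDict([
                ("Lead ID", "64556"),
                ("Client Name", "Kamille Rona Venturina Taytay"),
            ])),
        ]),
    ]),
])


def records_to_frame(payload):
    """Hypothetical helper: one row per record, indexed by the record id."""
    records = payload["records"]
    frame = pd.DataFrame(
        [dict(rec["fields"]) for rec in records],
        index=pd.Index([rec["id"] for rec in records], name="id"),
    )
    # Records with an empty "fields" mapping produce all-NaN rows; drop them.
    return frame.dropna(how="all")


df = records_to_frame(data)
print(df)
```

Dropping the all-NaN rows is a choice, not a requirement. Leave out the `dropna` call if empty records should stay visible in the result.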