How to create a dictionary with multiple columns from Excel in Python?
I want to create a dictionary from values imported from an Excel file using Python. The columns in the Excel file look like this:
University | Year |
---|---|
IUB | 2013 |
IUB | 2013 |
IUB | 2013 |
IUB | 2014 |
IUB | 2015 |
BZU | 2013 |
BZU | 2013 |
BZU | 2014 |
UCP | 2016 |
UCP | 2016 |
UCP | 2013 |
UCP | 2014 |
The output should look like this:
'IUB': {'2013': '3', '2014': '1', '2015': '1'},
'BZU': {'2013': '2', '2014': '1'},
'UCP': {'2013': '1', '2014': '1', '2016': '2'}
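You can use pandas to read your Excel file. Then use groupby on the 'University' and 'Year' columns together with agg to count the rows for each University/Year pair. Use pivot to reshape the DataFrame so that each university becomes a column, then export it to a dictionary: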
import pandas as pd
df = pd.read_excel("your_excel_file.xlsx")
df['count'] = 0
df = df.groupby(['University', 'Year'], as_index=False)['count'].agg('count')
df = df.pivot(index="Year", columns="University", values="count")
output = df.to_dict()
print(output)
Output:
{'BZU': {2013: 2.0, 2014: 1.0, 2015: nan, 2016: nan}, 'IUB': {2013: 3.0, 2014: 1.0, 2015: 1.0, 2016: nan}, 'UCP': {2013: 1.0, 2014: 1.0, 2015: nan, 2016: 2.0}}
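pivot fills every University/Year combination that does not occur with nan. If necessary, you have to remove the nan values manually: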
for uni, year in output.items():
    # Iterate over a copy of the items so entries can be deleted safely
    for y, count in list(year.items()):
        if pd.isna(count):
            del year[y]
print(output)
Output:
{'BZU': {2013: 2.0, 2014: 1.0}, 'IUB': {2013: 3.0, 2014: 1.0, 2015: 1.0}, 'UCP': {2013: 1.0, 2014: 1.0, 2016: 2.0}}
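The result above has integer keys and float counts, while the question asks for strings. As a possible alternative, here is a minimal sketch that skips pivot entirely, so no nan values are created in the first place. It assumes the same 'University' and 'Year' column names; the DataFrame is built inline from the question's sample rows only to keep the example self-contained, and in practice it would come from pd.read_excel.

```python
import pandas as pd

# Sample rows copied from the question; normally this would be
# df = pd.read_excel("your_excel_file.xlsx")
df = pd.DataFrame({
    "University": ["IUB"] * 5 + ["BZU"] * 3 + ["UCP"] * 4,
    "Year": [2013, 2013, 2013, 2014, 2015,
             2013, 2013, 2014,
             2016, 2016, 2013, 2014],
})

# size() counts the rows in each (University, Year) group. Only pairs that
# actually occur are present, so there is nothing to clean up afterwards.
counts = df.groupby(["University", "Year"]).size()

# Build the nested dictionary, converting years and counts to strings
# to match the format requested in the question.
output = {}
for (uni, year), n in counts.items():
    output.setdefault(uni, {})[str(year)] = str(n)

print(output)
```

groupby sorts its keys by default, so the universities come out in alphabetical order rather than in the order they appear in the sheet. If the counts should stay numeric, replace str(n) with int(n).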