Pandas: how to combine multiple data columns and output them as a dictionary
I am working with a CSV file. The table in the file is structured as follows:
| brands | models | 2021_price | 2020_price |
|---|---|---|---|
| chevrolet | Traverse | 320000 | 24000 |
| chevrolet | Equinox | 23000 | 18000 |
| chevrolet | Trailblazer | 13000 | 14000 |
Here is what I tried myself:
json_dict = {}
for index,row in df.iterrows():
    data=(
        {row[0]:{
            ''.join(str(row[1])):
                {
                    "2021":' '.join(str(row[2]).split()),
                    '2020':' '.join(str(row[3]).split()),
                }
            }
        }
    )
    json_dict.update(data)
This is the output I get:
{
    "chevrolet":{
        "Traverse":{
            "2021":"320000",
            "2020":"24000",
        },
    "chevrolet":{
        "Equinox":{
            "2021":"23000",
            "2020":"18000",
        }
    }
But the dictionary I expect is:
{
    "chevrolet":{
        "Traverse":{
            "2021":"320000",
            "2020":"24000"
        },
        "Equinox":{
            "2021":"23000",
            "2020":"18000"
        }
    }
}
Here is a sample of the file:
NISSAN Patrol Platinum City 1,260,000,000.00 UZS Nan
NISSAN Qasgqai 315,000,000.00 UZS 315,000,
NISSAN X-Trail 367,500,000.00 UZS Nan
If I understand correctly, you want to group by "brands" and then build the dictionary:
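out = {}
# groupby yields one (brand, sub-DataFrame) pair per distinct value in "brands"
for b, g in df.groupby("brands"):
    # map every model of this brand to its prices, keyed by year
    out[b] = {
        row["models"]: {
            "2020": row["2020_price"],
            "2021": row["2021_price"],
        }
        for _, row in g.iterrows()
    }
print(out)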
Prints:
{
    "chevrolet": {
        "Traverse": {"2020": 24000, "2021": 320000},
        "Equinox": {"2020": 18000, "2021": 23000},
        "Trailblazer": {"2020": 14000, "2021": 13000},
    }
}
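If you would rather keep a row-by-row loop like the one in the question, a minimal sketch along the following lines should also produce the nested result. It is a different technique from the groupby version: it uses `dict.setdefault` so that each model is added to the brand's existing inner dictionary, whereas `json_dict.update(data)` replaces whatever was already stored under that brand. The DataFrame here is rebuilt inline from the table in the question (an assumption standing in for your real CSV), and the prices are converted to strings only because the expected dictionary shows them quoted.

```python
import pandas as pd

# Assumption: sample rows copied from the table in the question,
# used here in place of reading the real CSV file.
df = pd.DataFrame(
    {
        "brands": ["chevrolet", "chevrolet", "chevrolet"],
        "models": ["Traverse", "Equinox", "Trailblazer"],
        "2021_price": [320000, 23000, 13000],
        "2020_price": [24000, 18000, 14000],
    }
)

columns = ["brands", "models", "2021_price", "2020_price"]

json_dict = {}
# name=None makes itertuples yield plain tuples, so column names that
# start with a digit do not need to be valid attribute names.
for brand, model, price_2021, price_2020 in df[columns].itertuples(
    index=False, name=None
):
    # setdefault returns the brand's inner dict, creating it the first
    # time the brand is seen, so earlier models are kept.
    json_dict.setdefault(brand, {})[model] = {
        "2021": str(price_2021),
        "2020": str(price_2020),
    }

print(json_dict)
```

Unlike `groupby`, which sorts the group keys by default, this keeps brands and models in the order they appear in the file.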