Transform and reshape a DataFrame from wide to long with an additional column
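I have a data frame that I want to convert from wide format to long format, but I do not want to melt all of its columns.
Specifically, I want to melt the following data frame: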
import pandas as pd
data = {'year': [2014, 2018, 2020, 2017],
        'model': [12, 14, 21, 8],
        'amount': [100, 120, 80, 210],
        'quality': ["low", "high", "medium", "high"]
        }
# build the wide DataFrame; the dict keys become the column names
df = pd.DataFrame.from_dict(data)
print(df)
into this data frame:
data2 = {'year': [2014, 2014, 2018, 2018, 2020, 2020, 2017, 2017],
         'variable': ["model", "amount", "model", "amount", "model", "amount", "model", "amount"],
         'value': [12, 100, 14, 120, 21, 80, 8, 210],
         'quality': ["low", "low", "high", "high", "medium", "medium", "high", "high"]
         }
# build the desired long DataFrame; the dict keys become the column names
df2 = pd.DataFrame.from_dict(data2)
print(df2)
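I tried pd.melt() with various combinations of input parameters, and it works after a fashion if I leave out the quality column. But given the desired result, I cannot skip the quality column. I also tried df.pivot(), df.pivot_table(), and pd.wide_to_long(), each in several combinations, but I never got the desired result. Would it help to move the year and quality columns into the data frame's index before calling pd.melt()?
Thank you very much in advance for your help!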
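The index idea raised in the question can be sketched as follows. This is only an illustration, not the approach taken in the answer: it swaps pd.melt() for set_index() followed by stack(), and the name long_df is a hypothetical choice made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2014, 2018, 2020, 2017],
    "model": [12, 14, 21, 8],
    "amount": [100, 120, 80, 210],
    "quality": ["low", "high", "medium", "high"],
})

# Move the identifier columns into the index, then stack the remaining
# columns (model, amount) into a single long Series.
long_df = (
    df.set_index(["year", "quality"])
      .rename_axis(columns="variable")  # names the new index level created by stack()
      .stack()
      .rename("value")                  # names the stacked values
      .reset_index()
)

# Reorder the columns to match the desired layout.
long_df = long_df[["year", "variable", "value", "quality"]]
print(long_df)
```

Because stack() walks the frame row by row, each year's model and amount rows stay next to each other and the years keep their original order. Recent pandas 2.x releases may print a FutureWarning about the stack() implementation; it should not affect the result for this frame.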
import pandas as pd
data = {'year': [2014, 2018, 2020, 2017],
        'model': [12, 14, 21, 8],
        'amount': [100, 120, 80, 210],
        'quality': ["low", "high", "medium", "high"]
        }
# build the wide DataFrame; the dict keys become the column names
df = pd.DataFrame.from_dict(data)
print(df)
data2 = {'year': [2014, 2014, 2018, 2018, 2020, 2020, 2017, 2017],
         'variable': ["model", "amount", "model", "amount", "model", "amount", "model", "amount"],
         'value': [12, 100, 14, 120, 21, 80, 8, 210],
         'quality': ["low", "low", "high", "high", "medium", "medium", "high", "high"]
         }
# build the desired long DataFrame for comparison
df2 = pd.DataFrame.from_dict(data2)
print(df2)
# keep year and quality as identifier columns; melt the remaining columns (model, amount)
df3 = pd.melt(df, id_vars=['year', 'quality'], var_name='variable', value_name='value')
# reorder the columns to match df2
df3 = df3[['year', 'variable', 'value', 'quality']]
# a stable sort keeps 'model' ahead of 'amount' within each year
df3 = df3.sort_values('year', kind='stable')
print(df3)
Output (for df3):
   year variable  value quality
0  2014    model     12     low
4  2014   amount    100     low
3  2017    model      8    high
7  2017   amount    210    high
1  2018    model     14    high
5  2018   amount    120    high
2  2020    model     21  medium
6  2020   amount     80  medium
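Note that the output above is sorted by year, whereas the desired df2 keeps the rows in their original order (2014, 2018, 2020, 2017). If that order matters, one possible variation, offered here as a sketch rather than as part of the original answer, is to keep the original row labels during the melt and sort on them instead. It assumes pandas 1.1 or later, where pd.melt() accepts ignore_index; the name ordered is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2014, 2018, 2020, 2017],
    "model": [12, 14, 21, 8],
    "amount": [100, 120, 80, 210],
    "quality": ["low", "high", "medium", "high"],
})

ordered = (
    pd.melt(
        df,
        id_vars=["year", "quality"],
        var_name="variable",
        value_name="value",
        ignore_index=False,      # keep the original row labels 0..3 (each appears twice)
    )
    .sort_index(kind="stable")   # group rows by original position; model stays before amount
    .reset_index(drop=True)      # replace the repeated labels with a fresh 0..7 index
)

# Reorder the columns to match the desired layout.
ordered = ordered[["year", "variable", "value", "quality"]]
print(ordered)
```

Sorting on the preserved index rather than on year means the result does not depend on the years being unique or already ordered.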