How to split a pandas DataFrame from wide to tall shape
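I have a dataframe with the structure shown below. I worked out how to 'unpivot' it as follows, but I'm pretty sure this is not the most Pythonic way to do it. Can you suggest a better approach?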
v = [[{'BIN_ID_WDM': i, 'DSIMILARITY': df1.D1[i], 'BIN_ID_IHS': df1.ID1[i]},
{'BIN_ID_WDM': i, 'DSIMILARITY': df1.D2[i], 'BIN_ID_IHS': df1.ID2[i]},
{'BIN_ID_WDM': i, 'DSIMILARITY': df1.D3[i], 'BIN_ID_IHS': df1.ID3[i]},
{'BIN_ID_WDM': i, 'DSIMILARITY': df1.D4[i], 'BIN_ID_IHS': df1.ID4[i]},
{'BIN_ID_WDM': i, 'DSIMILARITY': df1.D5[i], 'BIN_ID_IHS': df1.ID5[i]}]
for i in df1.index]
The dataframe:
D1 D2 D3 D4 D5 ID1 ID2 ID3 ID4 ID5
WMAC
258403 0.002665 0.003306 0.001396 0.003395 0.003741 100000141725 100000141709 100000141696 100000141676 100000141294
105692 0.000016 0.000257 0.000264 0.000298 0.000349 100000030110 100000030243 100000030109 100000030166 100000323212
70795 0.001588 0.001564 0.000019 0.001828 0.001828 100000040111 100000028683 100000034744 100000324405 100000038952
IIUC, use pd.wide_to_long:
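# WMAC is the index in the frame shown above, so move it into a column first
pd.wide_to_long(df.reset_index(), stubnames=['D', 'ID'], i='WMAC', j='No')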
Output:
D ID
WMAC No
258403 1 0.002665 100000141725
105692 1 0.000016 100000030110
70795 1 0.001588 100000040111
258403 2 0.003306 100000141709
105692 2 0.000257 100000030243
70795 2 0.001564 100000028683
258403 3 0.001396 100000141696
105692 3 0.000264 100000030109
70795 3 0.000019 100000034744
258403 4 0.003395 100000141676
105692 4 0.000298 100000030166
70795 4 0.001828 100000324405
258403 5 0.003741 100000141294
105692 5 0.000349 100000323212
70795 5 0.001828 100000038952
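For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the three sample rows from the question and then renames the result to the column names used in the original list comprehension (BIN_ID_WDM, DSIMILARITY, BIN_ID_IHS). It assumes WMAC is the index, as the printed frame suggests. The name long_df and the choice to drop the helper column No are mine, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question; WMAC is assumed to be the index.
df = pd.DataFrame(
    {
        "D1": [0.002665, 0.000016, 0.001588],
        "D2": [0.003306, 0.000257, 0.001564],
        "D3": [0.001396, 0.000264, 0.000019],
        "D4": [0.003395, 0.000298, 0.001828],
        "D5": [0.003741, 0.000349, 0.001828],
        "ID1": [100000141725, 100000030110, 100000040111],
        "ID2": [100000141709, 100000030243, 100000028683],
        "ID3": [100000141696, 100000030109, 100000034744],
        "ID4": [100000141676, 100000030166, 100000324405],
        "ID5": [100000141294, 100000323212, 100000038952],
    },
    index=pd.Index([258403, 105692, 70795], name="WMAC"),
)

long_df = (
    # wide_to_long needs the id as a regular column, hence reset_index().
    # The stub "D" only matches D1..D5, and "ID" only matches ID1..ID5.
    pd.wide_to_long(df.reset_index(), stubnames=["D", "ID"], i="WMAC", j="No")
    .reset_index()
    # Map to the column names used in the question's list comprehension.
    .rename(columns={"WMAC": "BIN_ID_WDM", "D": "DSIMILARITY", "ID": "BIN_ID_IHS"})
    # "No" is the numeric suffix (1..5); drop it if the pairing is all you need.
    .drop(columns="No")
)

print(long_df)
```

Each WMAC value ends up with five rows, one per (D, ID) pair. Keep the No column instead of dropping it if you need to know which pair a row came from.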