How to transform all non-zero values to a new column in pandas
I have a DataFrame with 81 columns and about 3000 rows. Below is a sample df. I need to move all the non-zero values into a new column.
TO Rubber Wood Plastic Toy Metal Paper Result
AAA 0 0 0 0 0 9 1
BBB 60 0 0 0 0 0 -1
CCC 0 0.8 0 0 0 0 1
DDD 0 0 0 40 0 0 1
EEE 0 0 7 0 0 0 1
FFF 0 0 0 0 10 0 -1
I have managed to get the column names into a new column, but I could not do the same for the values:
df['Mat'] = (df.iloc[:, 1:82] != 0).idxmax(1)
The result I need:
TO Rubber Wood Plastic Toy Metal Paper Result WT Mat
AAA 0 0 0 0 0 9 1 60 Rubber
BBB 60 0 0 0 0 0 -1 0.8 Wood
CCC 0 0.8 0 0 0 0 1 7 Plastic
DDD 0 0 0 40 0 0 1 40 Toy
EEE 0 0 7 0 0 0 1 10 Metal
FFF 0 0 0 0 10 0 -1 9 Paper
I want to drop the unnecessary columns, so the final result should be:
To Wt Mat
AAA 60 Rubber
BBB 0.8 Wood
CCC 7 Plastic
DDD 40 Toy
EEE 10 Metal
FFF 9 Paper
df = df.set_index(['TO']).sum().reset_index()[:6].rename({'index':'Mat',0:'Wt'},axis=1).join(df['TO'])
##df[['TO','Wt','Mat']]
TO Wt Mat
0 AAA 60.0 Rubber
1 BBB 0.8 Wood
2 CCC 7.0 Plastic
3 DDD 40.0 Toy
4 EEE 10.0 Metal
5 FFF 9.0 Paper
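To make the one-liner above easier to follow, here is a step-by-step sketch of what it appears to do: it sums each material column over all rows, then pairs those column totals with `TO` purely by position. The sample frame is rebuilt from the question; the names `totals` and `out` are only illustrative. Note that the positional pairing lines up only because the sample has as many material columns as rows.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "TO": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
    "Rubber": [0, 60, 0, 0, 0, 0],
    "Wood": [0, 0, 0.8, 0, 0, 0],
    "Plastic": [0, 0, 0, 0, 7, 0],
    "Toy": [0, 0, 0, 40, 0, 0],
    "Metal": [0, 0, 0, 0, 0, 10],
    "Paper": [9, 0, 0, 0, 0, 0],
    "Result": [1, -1, 1, 1, 1, -1],
})

# Column-wise totals: one value per column, indexed by column name
totals = df.set_index("TO").sum().drop("Result")

# Turn the totals into a two-column frame: material name and weight
out = totals.rename_axis("Mat").reset_index(name="Wt")

# Attach TO by position (row 0 gets AAA, row 1 gets BBB, ...)
out.insert(0, "TO", df["TO"].to_numpy())
print(out[["TO", "Wt", "Mat"]])
```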
Select all the columns that hold the values and sum them row-wise:
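# 1:7 for the sample data; in the real data it seems to be 1:82
materials = df.iloc[:, 1:7]
# Row-wise sum: with one non-zero value per row, this is that value
df['Wt'] = materials.sum(axis=1)
# Name of the first column holding a non-zero value in each row
df['Mat'] = (materials != 0).idxmax(axis=1)
# Finally, keep only the necessary columns by list
df = df[['TO', 'Wt', 'Mat']]
print(df)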
TO Wt Mat
0 AAA 9.0 Paper
1 BBB 60.0 Rubber
2 CCC 0.8 Wood
3 DDD 40.0 Toy
4 EEE 7.0 Plastic
5 FFF 10.0 Metal
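For a version that can be run end to end, here is a minimal sketch of the same row-wise idea. It rebuilds the sample frame from the question and selects the material columns by name instead of by position; it assumes `TO` and `Result` are the only non-material columns, which may not hold for the real 81-column data. The names `materials` and `out` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "TO": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
    "Rubber": [0, 60, 0, 0, 0, 0],
    "Wood": [0, 0, 0.8, 0, 0, 0],
    "Plastic": [0, 0, 0, 0, 7, 0],
    "Toy": [0, 0, 0, 40, 0, 0],
    "Metal": [0, 0, 0, 0, 0, 10],
    "Paper": [9, 0, 0, 0, 0, 0],
    "Result": [1, -1, 1, 1, 1, -1],
})

# Assumption: every column except TO and Result is a material column
materials = df.drop(columns=["TO", "Result"])

out = pd.DataFrame({
    "TO": df["TO"],
    # Row-wise sum; equals the single non-zero value in each row
    "Wt": materials.sum(axis=1),
    # Name of the first column with a non-zero value in each row
    "Mat": materials.ne(0).idxmax(axis=1),
})
print(out)
```

Selecting by name avoids having to keep two `iloc` slices in sync when the number of columns changes. If a row contains only zeros, `idxmax` still returns the first column name, so such rows may need separate handling.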