Transform pandas DataFrame columns to a list according to the number in each row
I have a DataFrame like this:
Day Id Banana Apple
2020-01-01 1 1 1
2020-01-02 1 NaN 2
2020-01-03 2 2 2
How can I transform it into:
Day Id Banana Apple Products
2020-01-01 1 1 1 [Banana, Apple]
2020-01-02 1 NaN 2 [Apple, Apple]
2020-01-03 2 2 2 [Banana, Banana, Apple, Apple]
Select all columns except the first 2 by position with DataFrame.iloc, reshape with DataFrame.stack, repeat the resulting MultiIndex with Index.repeat, and aggregate the column names into lists:
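# Long format: one (row, column) entry per count; drop missing counts
s = df.iloc[:, 2:].stack().dropna()
# Repeat each (row, column) pair by its count, then collect the column names per row
df['Products'] = s[s.index.repeat(s)].reset_index().groupby(['level_0'])['level_1'].agg(list)
print(df)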
Day Id Banana Apple Products
0 2020-01-01 1 1.0 1 [Banana, Apple]
1 2020-01-02 1 NaN 2 [Apple, Apple]
2 2020-01-03 2 2.0 2 [Banana, Banana, Apple, Apple]
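To make the intermediate steps easier to follow, here is a self-contained sketch of the same idea on the sample data from the question. The names `counts`, `repeated` and `products` are only illustrative, and the explicit `dropna()` and `astype(int)` are assumptions added so the repeat counts are plain integers regardless of how `stack` treats missing values in a given pandas version.

```python
import numpy as np
import pandas as pd

# Sample data from the question (Banana has a missing value, so it is float)
df = pd.DataFrame({
    "Day": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "Id": [1, 1, 2],
    "Banana": [1, np.nan, 2],
    "Apple": [1, 2, 2],
})

# Long format: one (row label, column name) entry per count, missing counts dropped
counts = df.iloc[:, 2:].stack().dropna().astype(int)

# Repeat every (row label, column name) pair as many times as its count
repeated = counts.index.repeat(counts.to_numpy())

# Keep the column names, keyed by the original row label, and gather them per row
products = (
    pd.Series(repeated.get_level_values(1), index=repeated.get_level_values(0))
    .groupby(level=0)
    .agg(list)
)

# Assignment aligns on the row labels of df
df["Products"] = products
print(df)
```

This variant reads the names straight from the repeated MultiIndex instead of indexing `s` with it, which avoids relying on the `level_0` / `level_1` column names produced by `reset_index`.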
Or use a custom function that drops the missing values in each row and repeats the column names by their counts:
def f(x):
    # Drop missing counts, then repeat each column name by its count
    s = x.dropna()
    return s.index.repeat(s).tolist()

df['Products'] = df.iloc[:, 2:].apply(f, axis=1)
print(df)
Day Id Banana Apple Products
0 2020-01-01 1 1.0 1 [Banana, Apple]
1 2020-01-02 1 NaN 2 [Apple, Apple]
2 2020-01-03 2 2.0 2 [Banana, Banana, Apple, Apple]
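One difference between the two approaches shows up when a row has no counts at all. The sketch below is a runnable version of the row-wise function; the fourth row, the name `repeat_names` and the explicit `astype(int)` are assumptions added for illustration and are not part of the original data.

```python
import numpy as np
import pandas as pd

def repeat_names(row):
    """Return the column names of `row`, each repeated by its (non-missing) count."""
    counts = row.dropna().astype(int)
    return counts.index.repeat(counts.to_numpy()).tolist()

# Sample data from the question plus one hypothetical row with no counts
df = pd.DataFrame({
    "Day": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    "Id": [1, 1, 2, 2],
    "Banana": [1, np.nan, 2, np.nan],
    "Apple": [1, 2, 2, np.nan],
})

# Apply row by row to the count columns only
df["Products"] = df.iloc[:, 2:].apply(repeat_names, axis=1)
print(df)
```

With this function the all-missing row gets an empty list, whereas a groupby-based result has no entry for that row, so the assignment would leave NaN there.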