How to select rows based on a dynamic column value?
First, I have the following DataFrame, df_A:

sector | SALES | EBIT | DPS |
---|---|---|---|
IT | xxxx | yyyy | zzz |
ENERGY | xxxx | yyyy | zzz |
FINANCE | xxxx | yyyy | zzz |
CONSUMER | xxxx | yyyy | zzz |

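and another DataFrame, df_B:

NAME | sector | SALES | EBIT | DPS |
---|---|---|---|---|
AAPL | IT | xxxx | yyyy | zzz |
BP | ENERGY | xxxx | yyyy | zzz |
TGT | CONSUMER | xxxx | yyyy | zzz |
MSFT | IT | xxxx | yyyy | zzz |
HSBC | FINANCE | xxxx | yyyy | zzz |
GOOG | IT | xxxx | yyyy | zzz |
WMT | CONSUMER | xxxx | yyyy | zzz |
META | IT | xxxx | yyyy | zzz |
CVX | ENERGY | xxxx | yyyy | zzz |
JPM | FINANCE | xxxx | yyyy | zzz |
MCD | CONSUMER | xxxx | yyyy | zzz |

and so on.
This is just an example; my real DataFrames are much larger.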
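What I want to do is split df_B by sector into new DataFrames, with the new DataFrames following the order of df_A["sector"], and then merge them back together, preferably side by side (horizontally).
So in the end I would like my output to look like this:
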
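NAME | sector | SALES | EBIT | DPS | NAME | sector | SALES | EBIT | DPS | NAME | sector | SALES | EBIT | DPS | NAME | sector | SALES | EBIT | DPS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AAPL | IT | xxxx | yyyy | zzz | BP | ENERGY | xxxx | yyyy | zzz | HSBC | FINANCE | xxxx | yyyy | zzz | WMT | CONSUMER | xxxx | yyyy | zzz |
MSFT | IT | xxxx | yyyy | zzz | CVX | ENERGY | xxxx | yyyy | zzz | JPM | FINANCE | xxxx | yyyy | zzz | TGT | CONSUMER | xxxx | yyyy | zzz |
GOOG | IT | xxxx | yyyy | zzz | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | MCD | CONSUMER | xxxx | yyyy | zzz |
META | IT | xxxx | yyyy | zzz |
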
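If the horizontal layout above is not possible, a vertical table is fine too.
I'm new to Python. I tried using for loops, dictionaries, and loc/iloc, but somehow none of my code works...
Any help is greatly appreciated.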
Create N DataFrames, one per sector, then concatenate them into a single one:

    import pandas as pd

    # One block per sector, in df_A's order. Resetting each block's index
    # makes the blocks line up row by row when concatenated side by side.
    out = pd.concat([df_B[df_B['sector'] == sector].reset_index(drop=True)
                     for sector in df_A['sector'].unique()], axis=1)
    print(out)

    # Output
       NAME  sector  SALES  EBIT  DPS  NAME  sector  SALES  EBIT  DPS  NAME   sector  SALES  EBIT  DPS  NAME    sector  SALES  EBIT  DPS
    0  AAPL      IT   xxxx  yyyy  zzz    BP  ENERGY   xxxx  yyyy  zzz  HSBC  FINANCE   xxxx  yyyy  zzz   TGT  CONSUMER   xxxx  yyyy  zzz
    1  MSFT      IT   xxxx  yyyy  zzz   CVX  ENERGY   xxxx  yyyy  zzz   JPM  FINANCE   xxxx  yyyy  zzz   WMT  CONSUMER   xxxx  yyyy  zzz
    2  GOOG      IT   xxxx  yyyy  zzz   NaN     NaN    NaN   NaN  NaN   NaN      NaN    NaN   NaN  NaN   MCD  CONSUMER   xxxx  yyyy  zzz
    3  META      IT   xxxx  yyyy  zzz   NaN     NaN    NaN   NaN  NaN   NaN      NaN    NaN   NaN  NaN   NaN       NaN    NaN   NaN  NaN
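
Because every block repeats the same column names, the combined frame can be awkward to index afterwards. As a rough sketch of one way around that (not part of the answer above), the blocks can be labelled with the `keys` argument of `pd.concat`, which adds the sector as an outer column level. The helper name `side_by_side` and the numeric SALES values are invented purely for illustration.

```python
import pandas as pd

# Placeholder data: the SALES numbers are made up for this sketch.
df_A = pd.DataFrame({"sector": ["IT", "ENERGY", "FINANCE", "CONSUMER"]})
df_B = pd.DataFrame({
    "NAME": ["AAPL", "BP", "TGT", "MSFT", "HSBC", "GOOG"],
    "sector": ["IT", "ENERGY", "CONSUMER", "IT", "FINANCE", "IT"],
    "SALES": [1, 2, 3, 4, 5, 6],
})


def side_by_side(df, order, by="sector"):
    """Hypothetical helper: one block of rows per value in `order`, left to right."""
    order = list(order)
    blocks = [df[df[by] == key].reset_index(drop=True) for key in order]
    # `keys` labels each block, so the result has two column levels:
    # (sector, original column name).
    return pd.concat(blocks, axis=1, keys=order)


out = side_by_side(df_B, df_A["sector"].unique())
print(out)
```

With this layout, a single block can be pulled back out with `out["IT"]`, and a single column with `out[("IT", "NAME")]`.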
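
The question also says a vertical table would be acceptable. A minimal sketch of that variant, assuming the only requirement is that rows appear grouped by sector in df_A's order: turn the sector column into an ordered categorical and sort on it. The temporary column name `_key` and the trimmed sample data are assumptions made for this example.

```python
import pandas as pd

# Trimmed sample data, for illustration only.
df_A = pd.DataFrame({"sector": ["IT", "ENERGY", "FINANCE", "CONSUMER"]})
df_B = pd.DataFrame({
    "NAME": ["AAPL", "BP", "TGT", "MSFT", "HSBC", "GOOG"],
    "sector": ["IT", "ENERGY", "CONSUMER", "IT", "FINANCE", "IT"],
})

order = df_A["sector"].unique()
# An ordered categorical sorts by each label's position in `order`
# instead of alphabetically.
sort_key = pd.Categorical(df_B["sector"], categories=order, ordered=True)

vertical = (
    df_B.assign(_key=sort_key)
        .sort_values("_key", kind="mergesort")  # mergesort is stable: df_B's row order is kept within a sector
        .drop(columns="_key")
        .reset_index(drop=True)
)
print(vertical)
```

Sectors present in df_B but missing from df_A would become NaN in the sort key and end up at the bottom, which may or may not be what is wanted.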