Lookup on two values and fill in

I have a dataframe (my real dataframe has 50,000 rows and 34 columns):

import pandas as pd

df = pd.DataFrame({
    'NAME': ['APPLE COMPANY A', 'BANANA COMPANY B', 'ORANGE COMPANY C', 'APPLE COMPANY A'],
    'INVESTMENTS': ['OIL LTD', 'GOLD LTD', 'GAS LTD', 'GAS LTD'],
    'STOCKS' : [100, 200, 300, 400],
    'OIL LTD': [0, 0, 0, 0],
    'GOLD LTD': [0, 0, 0, 0],
    'GAS LTD': [0, 0, 0, 0],
    })

               NAME INVESTMENTS  STOCKS  OIL LTD  GOLD LTD  GAS LTD
0   APPLE COMPANY A     OIL LTD     100        0         0        0
1  BANANA COMPANY B    GOLD LTD     200        0         0        0
2  ORANGE COMPANY C     GAS LTD     300        0         0        0
3   APPLE COMPANY A     GAS LTD     400        0         0        0

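How can I look up the values in column STOCKS based on the values in NAME and on the column names? For example, for the first value in column OIL LTD, it searches for APPLE COMPANY A in column NAME and for OIL LTD in column INVESTMENTS (based on the column of the same name), which gives the value 100, as can be seen below. So the values it searches for come from the column names OIL LTD, GOLD LTD, GAS LTD, etc., based on the values of NAME and INVESTMENTS.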

I would like the output to look like this:

               NAME INVESTMENTS  STOCKS  OIL LTD  GOLD LTD  GAS LTD
0   APPLE COMPANY A     OIL LTD     100      100         0      400
1  BANANA COMPANY B    GOLD LTD     200        0       200        0
2  ORANGE COMPANY C     GAS LTD     300        0         0      300
3   APPLE COMPANY A     GAS LTD     400        0         0      400

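If I wanted to look up on a single value I would normally use pd.merge(), but I am not sure whether that works for two values. It works in Excel, but the function takes 15 minutes to run per column, which is not efficient.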

If the last columns are filled only with 0, a solution is to pivot, then drop those columns and finally join:

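# Reshape to one row per NAME and one column per INVESTMENTS value
# (keyword arguments are required by DataFrame.pivot in pandas 2.0+)
df1 = df.pivot(index='NAME', columns='INVESTMENTS', values='STOCKS').fillna(0).astype(int)
# Replace the zero-filled placeholder columns with the pivoted values
df = df.drop(df1.columns, axis=1).join(df1, on='NAME')
print(df)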
               NAME INVESTMENTS  STOCKS  GAS LTD  GOLD LTD  OIL LTD
0   APPLE COMPANY A     OIL LTD     100      400         0      100
1  BANANA COMPANY B    GOLD LTD     200        0       200        0
2  ORANGE COMPANY C     GAS LTD     300      300         0        0
3   APPLE COMPANY A     GAS LTD     400      400         0      100

If you need the same column order as in the original DataFrame:

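# Run on the original df: remember the order of the placeholder columns
cols = df.columns.drop(['NAME','INVESTMENTS','STOCKS'])
# Pivot, then select the columns in that original order
df1 = df.pivot(index='NAME', columns='INVESTMENTS', values='STOCKS').fillna(0).astype(int)[cols]
df = df.drop(df1.columns, axis=1).join(df1, on='NAME')
print(df)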
               NAME INVESTMENTS  STOCKS  OIL LTD  GOLD LTD  GAS LTD
0   APPLE COMPANY A     OIL LTD     100      100         0      400
1  BANANA COMPANY B    GOLD LTD     200        0       200        0
2  ORANGE COMPANY C     GAS LTD     300        0         0      300
3   APPLE COMPANY A     GAS LTD     400      100         0      400
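
One caveat with the pivot approach above: DataFrame.pivot raises a ValueError if the same NAME / INVESTMENTS pair occurs more than once, which may well happen in a 50,000-row frame. Below is a minimal sketch of one way around that, using pivot_table instead. It assumes that duplicate pairs should be summed (the question does not say how they should be combined), and the extra fifth row and the names key_cols, wide and result are made up for illustration only.

```python
import pandas as pd

# Sample data from the question plus one hypothetical duplicate row
# (APPLE COMPANY A / GAS LTD appears twice) to show the aggregation.
df = pd.DataFrame({
    'NAME': ['APPLE COMPANY A', 'BANANA COMPANY B', 'ORANGE COMPANY C',
             'APPLE COMPANY A', 'APPLE COMPANY A'],
    'INVESTMENTS': ['OIL LTD', 'GOLD LTD', 'GAS LTD', 'GAS LTD', 'GAS LTD'],
    'STOCKS': [100, 200, 300, 400, 50],
    'OIL LTD': [0, 0, 0, 0, 0],
    'GOLD LTD': [0, 0, 0, 0, 0],
    'GAS LTD': [0, 0, 0, 0, 0],
})

key_cols = ['NAME', 'INVESTMENTS', 'STOCKS']
# Placeholder columns, in their original order
cols = df.columns.drop(key_cols)

# pivot_table aggregates duplicate NAME/INVESTMENTS pairs instead of raising.
# Assumption: duplicates should be summed; choose another aggfunc if not.
wide = (
    df.pivot_table(index='NAME', columns='INVESTMENTS',
                   values='STOCKS', aggfunc='sum', fill_value=0)
      # Keep the original column order; investments that never occur stay 0
      .reindex(columns=cols, fill_value=0)
      .astype(int)
)

# Swap the zero-filled placeholder columns for the looked-up values
result = df.drop(columns=cols).join(wide, on='NAME')
print(result)
```

The reindex step also covers placeholder columns whose name never appears in INVESTMENTS; with the plain pivot, selecting such a column with [cols] would raise a KeyError.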