KeyError: 'One or more row labels was not found' >> lookup python pandas

My script has a problem: it raises KeyError: 'One or more row labels was not found'.

import pandas as pd

df1 = pd.DataFrame({'Destcode' : ['A','B','C','D','E','F','G'],
                     'City A' : ['Available','Available','Available','Available','Not Available','Not Available','Available'],
                     'City B' : ['Not Available','Available','Not Available','Available','Not Available','Not Available','Available'],
                     'City C' : ['Available','Available','Not Available','Available','Not Available','Available','Available']})

df2 = pd.DataFrame({'Destcode' : ['C','F','G','D','E'],
                     'Origin' : ['City A','City C','City A','City B','City D']})

So I have two DataFrames.

DataFrame 1

df1
   Destcode       City A         City B         City C
0     A       Available      Not Available   Available
1     B       Available      Available       Available
2     C       Available      Not Available   Not Available
3     D       Available      Available       Available
4     E       Not Available  Not Available   Not Available
5     F       Not Available  Not Available   Available
6     G       Available      Available       Available

DataFrame 2

df2
    Destcode    Origin
0      C        City A
1      F        City C
2      G        City A
3      D        City B
4      E        City D

I run this script:

df2['Cek Available'] = df1.set_index('Destcode').lookup(df2.Destcode, df2.Origin)

I get the KeyError shown above.

I know the problem is the Origin value 'City D', which does not exist as a column in df1.
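A quick way to confirm which labels are missing is to compare the Origin values against the columns of df1. This is only a minimal sketch using the data from this question; the variable name missing_origins is made up for illustration.

```python
import pandas as pd

A, N = 'Available', 'Not Available'
df1 = pd.DataFrame({'Destcode': list('ABCDEFG'),
                    'City A': [A, A, A, A, N, N, A],
                    'City B': [N, A, N, A, N, N, A],
                    'City C': [A, A, N, A, N, A, A]})
df2 = pd.DataFrame({'Destcode': ['C', 'F', 'G', 'D', 'E'],
                    'Origin': ['City A', 'City C', 'City A', 'City B', 'City D']})

# Origin values that have no matching column in df1
missing_origins = df2.loc[~df2['Origin'].isin(df1.columns), 'Origin'].unique().tolist()
print(missing_origins)  # ['City D']
```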

If the value being looked up does not exist, how can I make the lookup return 'Not Available' instead? Please help me solve this.

   Destcode   Origin    Cek Available
0      C      City A          ?
1      F      City C          ?
2      G      City A          ?
3      D      City B          ?
4      E      City D          ?

You can use try/except inside a for loop:

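import numpy as np

cek = []
# Index by Destcode so .loc can look up (row label, column label) pairs
indexed = df1.set_index('Destcode')
for c, o in zip(df2['Destcode'], df2['Origin']):
    try:
        x = indexed.loc[c, o]
    except KeyError:
        # Either the Destcode or the Origin is missing from df1
        x = np.nan

    cek.append(x)

df2['Cek Available'] = cek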

Output:

  Destcode  Origin Cek Available
0        C  City A     Available
1        F  City C     Available
2        G  City A     Available
3        D  City B     Available
4        E  City D           NaN
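The question asks for 'Not Available' rather than NaN when a label is missing. One way to get that, sketched below, is to wrap the same try/except in a small helper that takes a default. The helper name lookup_or_default and the variable wide are hypothetical, chosen only for this example.

```python
import pandas as pd

A, N = 'Available', 'Not Available'
df1 = pd.DataFrame({'Destcode': list('ABCDEFG'),
                    'City A': [A, A, A, A, N, N, A],
                    'City B': [N, A, N, A, N, N, A],
                    'City C': [A, A, N, A, N, A, A]})
df2 = pd.DataFrame({'Destcode': ['C', 'F', 'G', 'D', 'E'],
                    'Origin': ['City A', 'City C', 'City A', 'City B', 'City D']})

wide = df1.set_index('Destcode')

def lookup_or_default(table, row, col, default='Not Available'):
    """Return table.loc[row, col], or `default` if either label is missing."""
    try:
        return table.loc[row, col]
    except KeyError:
        return default

df2['Cek Available'] = [lookup_or_default(wide, c, o)
                        for c, o in zip(df2['Destcode'], df2['Origin'])]
print(df2)
```

This also avoids DataFrame.lookup, which was deprecated in pandas 1.2 and removed in 2.0.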

Data in tidy format is usually easier to work with. Melt df1 into long form and join it to df2.

df3 = df1.melt(id_vars="Destcode", var_name="Origin")
df2.merge(df3, on=["Destcode", "Origin"], how="left")

Result:

  Destcode  Origin      value
0        C  City A  Available
1        F  City C  Available
2        G  City A  Available
3        D  City B  Available
4        E  City D        NaN
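To end up with the column name and the 'Not Available' fallback requested in the question, the merge can be followed by a fillna. This is a sketch under the assumption that df1 itself contains no genuine NaN values, since fillna would overwrite those too; the names long and result are illustrative.

```python
import pandas as pd

A, N = 'Available', 'Not Available'
df1 = pd.DataFrame({'Destcode': list('ABCDEFG'),
                    'City A': [A, A, A, A, N, N, A],
                    'City B': [N, A, N, A, N, N, A],
                    'City C': [A, A, N, A, N, A, A]})
df2 = pd.DataFrame({'Destcode': ['C', 'F', 'G', 'D', 'E'],
                    'Origin': ['City A', 'City C', 'City A', 'City B', 'City D']})

# Long form: one row per (Destcode, Origin) pair, value stored in 'Cek Available'
long = df1.melt(id_vars='Destcode', var_name='Origin', value_name='Cek Available')

# Left join keeps every row of df2; unmatched pairs get NaN, then the fallback
result = df2.merge(long, on=['Destcode', 'Origin'], how='left')
result['Cek Available'] = result['Cek Available'].fillna('Not Available')
print(result)
```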

I would use get together with melt:

x = df1.melt(id_vars='Destcode', var_name='Origin').set_index(['Destcode', 'Origin']).squeeze()
df2['Cek Available'] = df2.apply(lambda y: x.get(tuple(y), np.nan), axis=1)

Output:

>>> df2
  Destcode  Origin Cek Available
0        C  City A     Available
1        F  City C     Available
2        G  City A     Available
3        D  City B     Available
4        E  City D           NaN
>>>
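The default passed to get does not have to be NaN. The sketch below passes 'Not Available' directly and builds the keys from only the Destcode and Origin columns, so it still works if df2 already has other columns (an assumption about how the code might be rerun, not something stated in the answer).

```python
import pandas as pd

A, N = 'Available', 'Not Available'
df1 = pd.DataFrame({'Destcode': list('ABCDEFG'),
                    'City A': [A, A, A, A, N, N, A],
                    'City B': [N, A, N, A, N, N, A],
                    'City C': [A, A, N, A, N, A, A]})
df2 = pd.DataFrame({'Destcode': ['C', 'F', 'G', 'D', 'E'],
                    'Origin': ['City A', 'City C', 'City A', 'City B', 'City D']})

# Series indexed by (Destcode, Origin) pairs
x = (df1.melt(id_vars='Destcode', var_name='Origin')
        .set_index(['Destcode', 'Origin'])['value'])

print(x.get(('C', 'City A')))                   # Available
print(x.get(('E', 'City D'), 'Not Available'))  # Not Available

# Look up each (Destcode, Origin) pair, falling back to 'Not Available'
df2['Cek Available'] = [x.get(key, 'Not Available')
                        for key in zip(df2['Destcode'], df2['Origin'])]
print(df2)
```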