How to select records with a NOT EXISTS condition in a pandas DataFrame

I have two dataframes, shown below. I want to rewrite a SQL query that selects data using a NOT EXISTS condition as pandas code.

SQL

SELECT ORDER_NUM, DRIVER
FROM df
WHERE 1=1
  AND NOT EXISTS (
        SELECT 1
        FROM order_addition oa
        WHERE oa.Flag_Value = 'Y'
          AND df.ORDER_NUM = oa.ORDER_NUM
      )

Sample data

order_addition.head(10)

ORDER_NUM   Flag_Value
22574536    Y
32459745    Y
15642314    Y
12478965    N
25845673    N
36789156    N

df.head(10)

ORDER_NUM   REGION  DRIVER
22574536    WEST    Ravi
32459745    WEST    David
15642314    SOUTH   Rahul
12478965    NORTH   David
25845673    SOUTH   Mani
36789156    SOUTH   Tim

How can I do this easily in pandas?

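IIUC, you can left-merge against the rows of df1 where Flag_Value equals Y, then keep the rows where the merged Flag_Value is NaN (here df1 is order_addition and df2 is df):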

result = df2.merge(df1[df1["Flag_Value"].eq("Y")], how="left", on="ORDER_NUM")

print(result[result["Flag_Value"].isnull()])

   ORDER_NUM REGION DRIVER Flag_Value
3   12478965  NORTH  David        NaN
4   25845673  SOUTH   Mani        NaN
5   36789156  SOUTH    Tim        NaN
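
A variation on the same idea, in case it helps: merge accepts indicator=True, which adds a _merge column, so you can filter on "left_only" instead of checking a data column for NaN. This is only a minimal sketch built from the sample data in the question; the names flagged, merged and not_exists are mine, and the drop_duplicates() call assumes you want each df row returned at most once even if order_addition repeats an ORDER_NUM.

```python
import pandas as pd

# Sample data copied from the question
order_addition = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "Flag_Value": ["Y", "Y", "Y", "N", "N", "N"],
})
df = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "REGION": ["WEST", "WEST", "SOUTH", "NORTH", "SOUTH", "SOUTH"],
    "DRIVER": ["Ravi", "David", "Rahul", "David", "Mani", "Tim"],
})

# Keys of the rows the subquery would match (Flag_Value = 'Y'),
# de-duplicated so the merge cannot multiply rows of df
flagged = order_addition.loc[
    order_addition["Flag_Value"] == "Y", ["ORDER_NUM"]
].drop_duplicates()

# Left merge with an indicator column; "left_only" marks rows of df
# that have no match in flagged, i.e. the NOT EXISTS rows
merged = df.merge(flagged, how="left", on="ORDER_NUM", indicator=True)
not_exists = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")

print(not_exists)
```

Filtering on _merge avoids relying on Flag_Value never being NaN in the original data, which the isnull() check implicitly assumes.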

If your ORDER_NUM values are unique, it's even simpler:

print(df2.loc[~df2["ORDER_NUM"].isin(df1.loc[df1["Flag_Value"].eq("Y"), "ORDER_NUM"])])

   ORDER_NUM REGION DRIVER
3   12478965  NORTH  David
4   25845673  SOUTH   Mani
5   36789156  SOUTH    Tim
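
To get closer to the original SQL, which only selects ORDER_NUM and DRIVER, you can pass a column list to .loc together with the boolean mask. Here is a self-contained sketch using the sample data from the question; flagged_orders and result are names I made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
order_addition = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "Flag_Value": ["Y", "Y", "Y", "N", "N", "N"],
})
df = pd.DataFrame({
    "ORDER_NUM": [22574536, 32459745, 15642314, 12478965, 25845673, 36789156],
    "REGION": ["WEST", "WEST", "SOUTH", "NORTH", "SOUTH", "SOUTH"],
    "DRIVER": ["Ravi", "David", "Rahul", "David", "Mani", "Tim"],
})

# ORDER_NUM values the subquery would find (Flag_Value = 'Y')
flagged_orders = order_addition.loc[order_addition["Flag_Value"] == "Y", "ORDER_NUM"]

# NOT EXISTS: keep rows of df whose ORDER_NUM is not among the flagged ones,
# and project only the columns named in the SELECT list
result = df.loc[~df["ORDER_NUM"].isin(flagged_orders), ["ORDER_NUM", "DRIVER"]]

print(result)
```

Since isin only builds a boolean mask over the rows of df, each row of df appears at most once in the result and the original index is kept.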