pandas merge asof with more than one match

I want to join the following DataFrames with pandas merge_asof:

import pandas as pd

ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')]], columns=['date_left'])
rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
                   [pd.to_datetime('2010-01-01'), 6]], columns=['date_right', 'variable'])

This is ll:

    date_left
0   2010-01-01
1   2010-02-01

and rr:

    date_right  variable
0   2010-01-01  12
1   2010-01-01  6

The following

pd.merge_asof(ll, rr, left_on = 'date_left', right_on='date_right', direction='backward')

gives me

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-02-01  2010-01-01  6

but I want (and would expect, since it is a left join)

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-01-01  2010-01-01  12
2   2010-02-01  2010-01-01  6
3   2010-02-01  2010-01-01  12

How can I achieve this result?

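---- EDIT ----: Sammywemmy gave a solution using conditional_join from janitor (pyjanitor). This works for the minimal example I posted above. However, I still want the rest of merge_asof's behavior, namely that each left row matches only the most recent right-hand date. What I mean is: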

ll = pd.DataFrame([[pd.to_datetime('2010-01-01')], [pd.to_datetime('2010-02-01')],[pd.to_datetime('2010-03-01')], [pd.to_datetime('2010-04-01')]], columns = ['date_left'])

ll =

    date_left
0   2010-01-01
1   2010-02-01
2   2010-03-01
3   2010-04-01

rr = pd.DataFrame([[pd.to_datetime('2010-01-01'), 12],
                   [pd.to_datetime('2010-01-01'), 6],
                   [pd.to_datetime('2010-03-01'), 3]], columns = ['date_right', 'variable'])

rr =

    date_right  variable
0   2010-01-01  12
1   2010-01-01  6
2   2010-03-01  3

Then I want:

    date_left   date_right  variable
0   2010-01-01  2010-01-01  6
1   2010-01-01  2010-01-01  12
2   2010-02-01  2010-01-01  6
3   2010-02-01  2010-01-01  12
4   2010-03-01  2010-03-01  3
5   2010-04-01  2010-03-01  3

whereas conditional_join would give me:

    date_left   date_right  variable
0   2010-01-01  2010-01-01  12
1   2010-01-01  2010-01-01  6
2   2010-02-01  2010-01-01  12
3   2010-02-01  2010-01-01  6
4   2010-03-01  2010-01-01  12
5   2010-03-01  2010-01-01  6
6   2010-03-01  2010-03-01  3
7   2010-04-01  2010-01-01  12
8   2010-04-01  2010-01-01  6
9   2010-04-01  2010-03-01  3

Thanks

A pd.merge_asof followed by a regular merge should be enough:

(pd.merge_asof(ll, rr.date_right, left_on='date_left', right_on = 'date_right')
   .merge(rr, on='date_right', how = 'left')
)
   date_left date_right  variable
0 2010-01-01 2010-01-01        12
1 2010-01-01 2010-01-01         6
2 2010-02-01 2010-01-01        12
3 2010-02-01 2010-01-01         6

This also works for the updated example in the question:

(pd.merge_asof(ll, rr.date_right, left_on='date_left', right_on = 'date_right')
   .merge(rr, on='date_right', how = 'left')
)

   date_left date_right  variable
0 2010-01-01 2010-01-01        12
1 2010-01-01 2010-01-01         6
2 2010-02-01 2010-01-01        12
3 2010-02-01 2010-01-01         6
4 2010-03-01 2010-03-01         3
5 2010-04-01 2010-03-01         3
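
For reference, here is a self-contained sketch of the same two-step idea on the updated example. The intermediate names (keys, matched, result) are only illustrative, and de-duplicating the right-hand keys before merge_asof is an optional addition that is not in the snippet above; it just makes explicit that the first step only picks a date and the second step brings back every row for that date.

```python
import pandas as pd

ll = pd.DataFrame({
    "date_left": pd.to_datetime(
        ["2010-01-01", "2010-02-01", "2010-03-01", "2010-04-01"]
    )
})
rr = pd.DataFrame({
    "date_right": pd.to_datetime(["2010-01-01", "2010-01-01", "2010-03-01"]),
    "variable": [12, 6, 3],
})

# Step 1: as-of match against the distinct, sorted right-hand dates only.
# Each left row gets the most recent date_right at or before its date_left.
keys = rr[["date_right"]].drop_duplicates().sort_values("date_right")
matched = pd.merge_asof(
    ll, keys, left_on="date_left", right_on="date_right", direction="backward"
)

# Step 2: ordinary left merge on the matched date, which expands each left row
# to every row of rr that shares that date.
result = matched.merge(rr, on="date_right", how="left")
print(result)
```

Note that merge_asof requires both inputs to be sorted by their key columns, so sort ll by date_left first if it is not already in order.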