Join, merge, or reshape a DataFrame based on two conditions

Question

I have two DataFrames, df and df1, that I want to merge or join.

import pandas as pd

df = pd.DataFrame(columns=['lt1', 'lt2','lt3','lt4','lt5','lt6'])
df['date'] = pd.date_range('2016-1-1', periods=5, freq='D')
df
   lt1  lt2  lt3  lt4  lt5  lt6       date
0  NaN  NaN  NaN  NaN  NaN  NaN 2016-01-01
1  NaN  NaN  NaN  NaN  NaN  NaN 2016-01-02
2  NaN  NaN  NaN  NaN  NaN  NaN 2016-01-03
3  NaN  NaN  NaN  NaN  NaN  NaN 2016-01-04
4  NaN  NaN  NaN  NaN  NaN  NaN 2016-01-05

df1 = pd.DataFrame({'location': ['lt1', 'lt3', 'lt6', 'lt1', 'lt2', 'lt3'],
                    'date': ['2016-01-1', '2016-01-02', '2016-01-1', '2016-01-03', '2016-01-5', '2016-01-4'],
                    'counts': ['2', '1', '1', '1', '3', '1']})

df1.date = pd.to_datetime(df1.date)
df1
  counts       date location
0      2 2016-01-01      lt1
1      1 2016-01-02      lt3
2      1 2016-01-01      lt6
3      1 2016-01-03      lt1
4      3 2016-01-05      lt2
5      1 2016-01-04      lt3

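I want to fill df with the counts values from df1, placed by location. The merge is keyed on the date column, but the values come from the df1.counts column and must land in the column of df whose name matches df1.location. The column names of df cover every name that appears in df1.location.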

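Merging on date alone is easy, but this is not a straightforward merge; it is closer to a reshape or a join. Any suggestions on how to get the following df as output?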

df
        date  lt1  lt2  lt3  lt4  lt5  lt6
0 2016-01-01    2    0    0    0    0    1
1 2016-01-02    0    0    1    0    0    0
2 2016-01-03    1    0    0    0    0    0
3 2016-01-04    0    0    1    0    0    0
4 2016-01-05    0    3    0    0    0    0

Answer 1

Here is one way, using pivot_table and combine_first:

m = df1.pivot_table(index='date', columns='location', values='counts', aggfunc='sum')
final = df.set_index('date').combine_first(m).fillna(0).reset_index()

Or simply:

# pivot takes keyword arguments only in pandas 2.0 and later
(df.set_index('date')
   .combine_first(df1.pivot(index='date', columns='location', values='counts'))
   .fillna(0).reset_index())

        date lt1 lt2 lt3  lt4  lt5 lt6
0 2016-01-01   2   0   0    0    0   1
1 2016-01-02   0   0   1    0    0   0
2 2016-01-03   1   0   0    0    0   0
3 2016-01-04   0   0   1    0    0   0
4 2016-01-05   0   3   0    0    0   0
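
A possible variation, in case integer counts are wanted: df1.counts holds strings in the question, so the approaches above leave strings mixed with the 0 fill value. The sketch below converts the counts with pd.to_numeric, pivots, and then reindexes against the dates and location columns of df. The names locations and wide are only illustrative, and the date strings are written in a uniform format here as an assumption for the example.

```python
import pandas as pd

# Sample data from the question (date strings normalized to one format).
df = pd.DataFrame(columns=['lt1', 'lt2', 'lt3', 'lt4', 'lt5', 'lt6'])
df['date'] = pd.date_range('2016-1-1', periods=5, freq='D')

df1 = pd.DataFrame({
    'location': ['lt1', 'lt3', 'lt6', 'lt1', 'lt2', 'lt3'],
    'date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-01',
                            '2016-01-03', '2016-01-05', '2016-01-04']),
    'counts': ['2', '1', '1', '1', '3', '1'],
})

# Location columns that the result should contain, in df's order.
locations = [c for c in df.columns if c != 'date']

wide = (
    df1.assign(counts=pd.to_numeric(df1['counts']))   # strings -> integers
       .pivot_table(index='date', columns='location', values='counts',
                    aggfunc='sum', fill_value=0)      # long -> wide
       # Add the dates and locations that df1 never mentions, filled with 0.
       .reindex(index=pd.DatetimeIndex(df['date'], name='date'),
                columns=locations, fill_value=0)
       .astype(int)
       .rename_axis(columns=None)                     # drop the 'location' label
       .reset_index()
)
print(wide)
```

pivot_table is used here rather than pivot because it aggregates repeated (date, location) pairs with sum, whereas pivot raises a ValueError when such duplicates exist.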

Tags: merge, join, pandas, python-3.5