Pandas: merge two time series and take the mean over the period where they overlap

Question

I have two pandas DataFrames, as follows:

ts1
Out[50]: 
                     soil_moisture_ids41  
date_time                                 
2007-01-07 05:00:00               0.1830  
2007-01-07 06:00:00               0.1825  
2007-01-07 07:00:00               0.1825  
2007-01-07 08:00:00               0.1825  
2007-01-07 09:00:00               0.1825  
...                                 ...  
2017-10-10 20:00:00               0.0650  
2017-10-10 21:00:00               0.0650  
2017-10-10 22:00:00               0.0650  
2017-10-10 23:00:00               0.0650  
2017-10-11 00:00:00               0.0650  

[94316 rows x 3 columns]

The other one is:

ts2
Out[51]: 
                     soil_moisture_ids42  
date_time                                                        
2016-07-20 00:00:00                0.147  
2016-07-20 01:00:00                0.148  
2016-07-20 02:00:00                0.149  
2016-07-20 03:00:00                0.150  
2016-07-20 04:00:00                0.152  
...                                 ...  
2019-12-31 19:00:00                0.216 
2019-12-31 20:00:00                0.216 
2019-12-31 21:00:00                0.215 
2019-12-31 22:00:00                0.215 
2019-12-31 23:00:00                0.215 

[30240 rows x 3 columns]

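As you can see, from 2007-01-07 to 2016-07-19 only ts1 has data points. From 2016-07-20 to 2017-10-11 the two time series overlap. I want to merge these two DataFrames. During the overlapping period, I want the mean of ts1 and ts2. During the non-overlapping periods (2007-01-07 to 2016-07-19 and 2017-10-12 to 2019-12-31), the value at each timestamp should be whichever of ts1 or ts2 has data. How can I do this?

Thanks!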

Answer 1

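Use concat followed by a groupby on the index with mean as the aggregation: a timestamp with only one value keeps that value unchanged, and a timestamp with several values gets their mean. The resulting DatetimeIndex is also sorted: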

s = pd.concat([ts1, ts2]).groupby(level=0).mean()
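
To make the behaviour concrete, here is a minimal sketch on made-up data. It assumes both frames are first given the same column name (the name `soil_moisture` is my own placeholder), because the outputs in the question show differently named columns (`soil_moisture_ids41` and `soil_moisture_ids42`). If the names stayed different, concat would keep them as separate columns and the mean would be taken per column rather than across the two series.

```python
import pandas as pd

# Toy stand-ins for ts1 and ts2; the values are invented for illustration.
ts1 = pd.DataFrame(
    {"soil_moisture_ids41": [0.10, 0.20, 0.30]},
    index=pd.to_datetime(
        ["2016-07-19 23:00", "2016-07-20 00:00", "2016-07-20 01:00"]
    ),
)
ts2 = pd.DataFrame(
    {"soil_moisture_ids42": [0.40, 0.50, 0.60]},
    index=pd.to_datetime(
        ["2016-07-20 00:00", "2016-07-20 01:00", "2016-07-20 02:00"]
    ),
)
ts1.index.name = ts2.index.name = "date_time"

# Assumption: give both frames one shared column name (hypothetical
# "soil_moisture") so that rows with the same timestamp land in the same
# column and can be averaged together.
a = ts1.rename(columns={"soil_moisture_ids41": "soil_moisture"})
b = ts2.rename(columns={"soil_moisture_ids42": "soil_moisture"})

# Stack the rows, then average all rows that share a timestamp.
# Timestamps present in only one frame keep their single value.
merged = pd.concat([a, b]).groupby(level=0).mean()
print(merged)
```

In this toy case the two overlapping timestamps (00:00 and 01:00) hold the average of both inputs, while 23:00 and 02:00 keep the single value that exists for them.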

