Pandas merge not giving expected output with datetime

I have two DataFrames. A sample of the first, "fgblquotef", is:

                         DateTimesy    VWPfgbmy
59       2014-09-05 06:00:24.033000  127.687514
60       2014-09-05 06:00:24.436000  127.687933
61       2014-09-05 06:00:24.597000  127.687746
62       2014-09-05 06:00:24.891000  127.687752
63       2014-09-05 06:00:25.178000  127.687730
64       2014-09-05 06:00:25.227000  127.687741
65       2014-09-05 06:00:26.035000  127.687651
66       2014-09-05 06:00:26.667000  127.689970
71       2014-09-05 06:00:26.677000  127.692642
72       2014-09-05 06:00:26.681000  127.692571
73       2014-09-05 06:00:26.688000  127.696051
75       2014-09-05 06:00:26.700000  127.696051
76       2014-09-05 06:00:26.702000  127.695850
79       2014-09-05 06:00:27.216000  127.687548
80       2014-09-05 06:00:27.910000  127.687512
81       2014-09-05 06:00:28.208000  127.687524
82       2014-09-05 06:00:28.289000  127.687436
83       2014-09-05 06:00:28.717000  127.687436
85       2014-09-05 06:00:28.998000  127.686910
87       2014-09-05 06:00:29.035000  127.687043
88       2014-09-05 06:00:29.062000  127.687534
89       2014-09-05 06:00:29.099000  127.687059
90       2014-09-05 06:00:29.327000  127.686843
91       2014-09-05 06:00:29.386000  127.686811
92       2014-09-05 06:00:29.505000  127.686984
93       2014-09-05 06:00:29.571000  127.686931
94       2014-09-05 06:00:29.602000  127.686989
96       2014-09-05 06:00:29.958000  127.686771
97       2014-09-05 06:00:29.960000  127.686759
98       2014-09-05 06:00:29.962000  127.686673

And a sample of the second, "df":

                        DateTimesx                 DateTimesy  
2       2014-09-05 06:00:23.596000 2014-09-05 06:00:24.596000  
3       2014-09-05 06:00:23.644000 2014-09-05 06:00:24.644000  
4       2014-09-05 06:00:23.694000 2014-09-05 06:00:24.694000  
5       2014-09-05 06:00:23.744000 2014-09-05 06:00:24.744000  
6       2014-09-05 06:00:23.794000 2014-09-05 06:00:24.794000  
7       2014-09-05 06:00:23.844000 2014-09-05 06:00:24.844000  
8       2014-09-05 06:00:23.894000 2014-09-05 06:00:24.894000  
9       2014-09-05 06:00:24.044000 2014-09-05 06:00:25.044000  
10      2014-09-05 06:00:24.294000 2014-09-05 06:00:25.294000  
11      2014-09-05 06:00:24.394000 2014-09-05 06:00:25.394000  
12      2014-09-05 06:00:24.444000 2014-09-05 06:00:25.444000  
13      2014-09-05 06:00:24.544000 2014-09-05 06:00:25.544000  
14      2014-09-05 06:00:24.694000 2014-09-05 06:00:25.694000  
15      2014-09-05 06:00:24.794000 2014-09-05 06:00:25.794000  
16      2014-09-05 06:00:24.844000 2014-09-05 06:00:25.844000  
17      2014-09-05 06:00:25.294000 2014-09-05 06:00:26.294000  
18      2014-09-05 06:00:25.394000 2014-09-05 06:00:26.394000  
19      2014-09-05 06:00:25.694000 2014-09-05 06:00:26.694000  
20      2014-09-05 06:00:25.794000 2014-09-05 06:00:26.794000  
21      2014-09-05 06:00:26.044000 2014-09-05 06:00:27.044000  
22      2014-09-05 06:00:26.294000 2014-09-05 06:00:27.294000  
23      2014-09-05 06:00:26.544000 2014-09-05 06:00:27.544000  
24      2014-09-05 06:00:26.694000 2014-09-05 06:00:27.694000  
25      2014-09-05 06:00:28.344000 2014-09-05 06:00:29.344000  
26      2014-09-05 06:00:29.044000 2014-09-05 06:00:30.044000  
27      2014-09-05 06:00:29.094000 2014-09-05 06:00:30.094000  
28      2014-09-05 06:00:29.144000 2014-09-05 06:00:30.144000  
29      2014-09-05 06:00:29.394000 2014-09-05 06:00:30.394000  
30      2014-09-05 06:00:29.744000 2014-09-05 06:00:30.744000  
31      2014-09-05 06:00:29.894000 2014-09-05 06:00:30.894000

The second DataFrame, "df", has a column df["DateTimesy"] that was created as follows:

td = pd.to_timedelta(1, unit="s")
df["DateTimesy"] = df["DateTimesx"] + td

Then I merge using:

df2 = pd.merge(df, fgbmquotef, on="DateTimesy", how="outer")

But the result I get is:

                        DateTimesx                 DateTimesy    VWPfgbmy  
0       2014-09-05 06:00:23.596000 2014-09-05 06:00:24.596000         NaN  
1       2014-09-05 06:00:23.644000 2014-09-05 06:00:24.644000         NaN  
2       2014-09-05 06:00:23.694000 2014-09-05 06:00:24.694000         NaN  
3       2014-09-05 06:00:23.744000 2014-09-05 06:00:24.744000         NaN  
4       2014-09-05 06:00:23.794000 2014-09-05 06:00:24.794000         NaN  
5       2014-09-05 06:00:23.844000 2014-09-05 06:00:24.844000         NaN  
6       2014-09-05 06:00:23.894000 2014-09-05 06:00:24.894000         NaN  
7       2014-09-05 06:00:24.044000 2014-09-05 06:00:25.044000         NaN  
8       2014-09-05 06:00:24.294000 2014-09-05 06:00:25.294000         NaN  
9       2014-09-05 06:00:24.394000 2014-09-05 06:00:25.394000         NaN  
10      2014-09-05 06:00:24.444000 2014-09-05 06:00:25.444000         NaN  
11      2014-09-05 06:00:24.544000 2014-09-05 06:00:25.544000         NaN  
12      2014-09-05 06:00:24.694000 2014-09-05 06:00:25.694000         NaN  
13      2014-09-05 06:00:24.794000 2014-09-05 06:00:25.794000         NaN  
14      2014-09-05 06:00:24.844000 2014-09-05 06:00:25.844000         NaN  
15      2014-09-05 06:00:25.294000 2014-09-05 06:00:26.294000         NaN  
16      2014-09-05 06:00:25.394000 2014-09-05 06:00:26.394000         NaN  
17      2014-09-05 06:00:25.694000 2014-09-05 06:00:26.694000         NaN  
18      2014-09-05 06:00:25.794000 2014-09-05 06:00:26.794000         NaN  
19      2014-09-05 06:00:26.044000 2014-09-05 06:00:27.044000         NaN  
20      2014-09-05 06:00:26.294000 2014-09-05 06:00:27.294000         NaN  
21      2014-09-05 06:00:26.544000 2014-09-05 06:00:27.544000         NaN  
22      2014-09-05 06:00:26.694000 2014-09-05 06:00:27.694000         NaN  
23      2014-09-05 06:00:28.344000 2014-09-05 06:00:29.344000         NaN  
24      2014-09-05 06:00:29.044000 2014-09-05 06:00:30.044000         NaN  
25      2014-09-05 06:00:29.094000 2014-09-05 06:00:30.094000         NaN  
26      2014-09-05 06:00:29.144000 2014-09-05 06:00:30.144000         NaN  
27      2014-09-05 06:00:29.394000 2014-09-05 06:00:30.394000         NaN  
28      2014-09-05 06:00:29.744000 2014-09-05 06:00:30.744000         NaN  
29      2014-09-05 06:00:29.894000 2014-09-05 06:00:30.894000         NaN 

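This is wrong, because the result should also contain the entries from "fgblquotef", not only those from "df". Can anyone explain what is happening here and where I went wrong?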

Maybe:

df2 = pd.merge(df, fgbmquotef, left_on="DateTimesy", right_on="DateTimesy", how="outer")  # though you shouldn't have to do this
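
To see what the outer merge is actually doing, it can help to reproduce it on a tiny example with indicator=True, which adds a _merge column recording whether each row came from the left frame, the right frame, or both. The sketch below uses made-up miniature frames (the names left and right and the specific rows are assumptions for illustration, not your data). If, as in the samples posted, few or none of the timestamps match exactly, the rows from the right frame are still in the result, but as separate rows with NaT in DateTimesx, and they may not be visible if only the first rows are printed.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames; the values are illustrative.
left = pd.DataFrame({
    "DateTimesx": pd.to_datetime([
        "2014-09-05 06:00:23.596",
        "2014-09-05 06:00:23.597",
    ]),
})
left["DateTimesy"] = left["DateTimesx"] + pd.to_timedelta(1, unit="s")

right = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.436",
        "2014-09-05 06:00:24.597",
    ]),
    "VWPfgbmy": [127.687933, 127.687746],
})

# Both key columns should hold real datetimes (not strings) before merging.
assert pd.api.types.is_datetime64_any_dtype(left["DateTimesy"])
assert pd.api.types.is_datetime64_any_dtype(right["DateTimesy"])

# indicator=True adds a "_merge" column: left_only, right_only, or both.
merged = pd.merge(left, right, on="DateTimesy", how="outer", indicator=True)
counts = merged["_merge"].value_counts()

# Sort by the key so the right-only rows are easy to spot among the others.
print(merged.sort_values("DateTimesy"))
print(counts)
```

Running the same value_counts() check on your real df2 should show whether the fgbmquotef rows are missing or just located further down than the part you printed.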

Try:

df2 = pd.merge(df.set_index("DateTimesy"), fgbmquotef.set_index("DateTimesy"), left_index=True, right_index=True, how="outer")

Or keep the key column (drop=False) and add suffixes:

df2 = pd.merge(df.set_index("DateTimesy", drop=False), fgbmquotef.set_index("DateTimesy", drop=False), left_index=True, right_index=True, how="outer", suffixes=("_df", "_fgbmquotef"))

Or without suffixes:

df2 = pd.merge(df.set_index("DateTimesy", drop=False), fgbmquotef.set_index("DateTimesy", drop=False), left_index=True, right_index=True, how="outer")
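
As a rough illustration of the index-based variants, here is a small self-contained sketch. The frames left and right, the column "a", and their rows are hypothetical stand-ins for df and fgbmquotef. It shows that an outer merge on the index keeps the union of the timestamps, and that with drop=False the two retained key columns are told apart by the suffixes.

```python
import pandas as pd

# Hypothetical stand-ins for df and fgbmquotef; the values are illustrative.
left = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.596",
        "2014-09-05 06:00:24.597",
    ]),
    "a": [1, 2],
})
right = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.597",
        "2014-09-05 06:00:24.891",
    ]),
    "VWPfgbmy": [127.687746, 127.687752],
})

# Merge on the index; drop=False keeps DateTimesy as a column in each frame,
# so the overlapping column name needs suffixes in the result.
by_index = pd.merge(
    left.set_index("DateTimesy", drop=False),
    right.set_index("DateTimesy", drop=False),
    left_index=True,
    right_index=True,
    how="outer",
    suffixes=("_df", "_fgbmquotef"),
).sort_index()

print(by_index)
```

Rows that exist on only one side get NaT/NaN in the other side's columns, which is what an outer join is expected to produce.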

Finally, try the concatenation functions: http://pandas.pydata.org/pandas-docs/stable/merging.html#concatenating-objects
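
A minimal sketch of the concat route, assuming the timestamps are unique within each frame (pd.concat along axis=1 aligns on the index and can fail on duplicate index values). The frames left and right and their rows are again hypothetical examples rather than your data.

```python
import pandas as pd

# Hypothetical stand-ins for df and fgbmquotef; the values are illustrative.
left = pd.DataFrame({
    "DateTimesx": pd.to_datetime([
        "2014-09-05 06:00:23.596",
        "2014-09-05 06:00:23.597",
    ]),
})
left["DateTimesy"] = left["DateTimesx"] + pd.to_timedelta(1, unit="s")

right = pd.DataFrame({
    "DateTimesy": pd.to_datetime([
        "2014-09-05 06:00:24.597",
        "2014-09-05 06:00:24.891",
    ]),
    "VWPfgbmy": [127.687746, 127.687752],
})

# axis=1 places the frames side by side, aligned on the shared index;
# the default join="outer" keeps the union of the timestamps.
combined = pd.concat(
    [left.set_index("DateTimesy"), right.set_index("DateTimesy")],
    axis=1,
).sort_index()

print(combined)
```

This gives the same union of timestamps as the outer merge on the index, with missing values wherever a timestamp exists in only one of the frames.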