Append data in real time to an empty pandas DataFrame

I would like to add some data in real time to an empty DataFrame:

import pandas as pd
import time

df = pd.DataFrame(columns=['time', 'price'])   # this is a simple example
                                               # but in my code, I have more 
                                               # columns: 'volume', etc.
for i in range(5):                             # here it lasts one day in my real use case
    time.sleep(2)
    t = pd.datetime.now()
    df[t] = 5 + i
    # here I need to have access to the latest updates of df

print df

The output is:

Empty DataFrame  
Columns: [time, price, 2015-12-27 01:55:29.812000, 2015-12-27 01:55:31.812000, 2015-12-27 01:55:33.812000, 2015-12-27 01:55:35.812000, 2015-12-27 01:55:37.812000]  
Index: []

But what I want is:

time                                price
2015-12-27 01:55:29.812000          5
2015-12-27 01:55:31.812000          6
2015-12-27 01:55:33.812000          7
...

How can I append data to a DataFrame like this?

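With df[t] you are indexing the DataFrame by column, so each assignment creates a new column named after the timestamp t. I think you want to index it by row instead.

From the looks of it, though, a Series might be a better fit, since you are updating by a time index.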

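import pandas as pd
import time

# Empty Series; an explicit dtype avoids pandas guessing one for empty data
series = pd.Series(dtype='float64')

for i in range(5):
    time.sleep(2)
    t = pd.Timestamp.now()  # current time, used as the index label
    series[t] = 5 + i       # assigning to a new label adds an entry

print(series)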
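
If the result is eventually needed as a two-column table like the one in the question, a Series built this way can be converted afterwards. The following is a minimal sketch, not a definitive recipe: it assumes fixed timestamps in place of live ones so that it is repeatable, and it borrows the column names time and price from the question.

```python
import pandas as pd

# Assumption: fixed timestamps stand in for pd.Timestamp.now()
start = pd.Timestamp("2015-12-27 01:55:29.812")

series = pd.Series(dtype="float64")
for i in range(3):
    t = start + pd.Timedelta(seconds=2 * i)
    series[t] = 5 + i  # new label, so the Series grows by one entry

# Name the index "time", then move it into a regular column next to "price"
table = series.rename_axis("time").reset_index(name="price")
print(table)
```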


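If you need a DataFrame, you can assign rows by label with df.loc[row_index] (the older df.ix indexer did the same job but was removed in pandas 1.0):

import pandas as pd
import time

df = pd.DataFrame(columns=['col1', 'col2'])

for i in range(5):
    time.sleep(2)
    t = pd.Timestamp.now()        # generate the row index
    df.loc[t] = [5 + i, 20 + i]   # values in column order: col1, col2

print(df)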
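
The desired output in the question keeps time as an ordinary column rather than as the index. One way to get that with row-wise assignment through df.loc is to use the next integer position as the row label. This is only a sketch under assumptions: the timestamps are fixed so the example is repeatable, and the name latest is purely illustrative.

```python
import pandas as pd

df = pd.DataFrame(columns=["time", "price"])

# Assumption: fixed timestamps stand in for pd.Timestamp.now()
start = pd.Timestamp("2015-12-27 01:55:29.812")

for i in range(3):
    t = start + pd.Timedelta(seconds=2 * i)
    df.loc[len(df)] = [t, 5 + i]  # len(df) is the next unused integer label
    latest = df.iloc[-1]          # the newest row is available inside the loop

print(df)
```

Because df is updated in place, the latest rows can be read at any point inside the loop, which is what the question asks for.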

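Consider appending each row from the loop to the DataFrame as it is produced. This was originally done with the DataFrame's append() method, which was removed in pandas 2.0; pd.concat() is the replacement:

import pandas as pd
import time

df = pd.DataFrame(columns=['time', 'price'])

for i in range(5):
    time.sleep(2)
    t = pd.Timestamp.now()
    row = pd.DataFrame({'time': [t], 'price': [5 + i]})
    df = pd.concat([df, row], ignore_index=True)  # renumber rows 0..n-1

print(df)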
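
Each append() or concat() call builds a new DataFrame, which may become slow if the loop runs for a whole day. An alternative worth considering is to collect the rows in a plain Python list and build the DataFrame only when the latest view is needed. The sketch below is illustrative rather than definitive: rows and snapshot are hypothetical names, and fixed timestamps replace live ones so the example is repeatable.

```python
import pandas as pd

rows = []  # plain Python list of row dicts


def snapshot(collected):
    """Build a DataFrame from the rows collected so far (hypothetical helper)."""
    return pd.DataFrame(collected, columns=["time", "price"])


# Assumption: fixed timestamps stand in for pd.Timestamp.now()
start = pd.Timestamp("2015-12-27 01:55:29.812")

for i in range(3):
    t = start + pd.Timedelta(seconds=2 * i)
    rows.append({"time": t, "price": 5 + i})
    df = snapshot(rows)  # latest view, rebuilt on demand

print(df)
```

In a real loop, snapshot() would only need to be called at the points where the up-to-date DataFrame is actually read.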