Trendline in Plotly Python

I am using Plotly in Python to generate a plot that displays data as a time series. I am using the following data from a SQLite database (shown below as dates and lines):

[(u'2015-12-08 00:00:00',), (u'2015-11-06 00:00:00',), (u'2015-11-06 00:00:00',), (u'2015-10-07 00:00:00',), (u'2015-10-06 00:00:00',), (u'2015-10-06 00:00:00',), (u'2015-09-17 00:00:00',), (u'2015-09-17 00:00:00',), (u'2015-09-17 00:00:00',), (u'2015-09-17 00:00:00',), (u'2015-09-16 00:00:00',), (u'2015-09-15 00:00:00',), (u'2015-09-15 00:00:00',), (u'2015-09-15 00:00:00',), (u'2015-08-30 00:00:00',), (u'2015-08-22 00:00:00',), (u'2015-08-22 00:00:00',), (u'2015-08-17 00:00:00',), (u'2015-08-09 00:00:00',), (u'2015-08-09 00:00:00',), (u'2015-08-08 00:00:00',), (u'2015-08-07 00:00:00',), (u'2015-07-28 00:00:00',), (u'2015-07-26 00:00:00',), (u'2015-07-22 00:00:00',), (u'2015-07-22 00:00:00',), (u'2015-07-22 00:00:00',), (u'2015-07-13 00:00:00',), (u'2015-07-13 00:00:00',), (u'2015-07-13 00:00:00',), (u'2015-07-13 00:00:00',), (u'2015-07-09 00:00:00',), (u'2015-07-09 00:00:00',), (u'2015-07-09 00:00:00',), (u'2015-07-09 00:00:00',), (u'2015-06-28 00:00:00',), (u'2015-06-28 00:00:00',), (u'2015-06-28 00:00:00',), (u'2015-06-16 00:00:00',), (u'2015-06-14 00:00:00',), (u'2015-06-14 00:00:00',), (u'2015-06-14 00:00:00',), (u'2015-06-04 00:00:00',), (u'2015-04-09 00:00:00',), (u'2015-03-31 00:00:00',), (u'2015-03-09 00:00:00',), (u'2015-03-09 00:00:00',), (u'2015-03-09 00:00:00',), (u'2015-03-09 00:00:00',), (u'2015-03-09 00:00:00',), (u'2015-03-09 00:00:00',)]
[(18,), (24,), (17,), (22,), (16,), (18,), (24,), (20,), (16,), (14,), (21,), (21,), (24,), (15,), (23,), (22,), (22,), (20,), (24,), (20,), (20,), (20,), (22,), (21,), (21,), (23,), (23,), (17,), (25,), (20,), (25,), (25,), (25,), (26,), (26,), (19,), (17,), (16,), (16,), (14,), (17,), (17,), (13,), (27,), (19,), (19,), (12,), (17,), (20,), (12,), (21,)]

Some of the data overlaps (multiple instances on the same day), but that probably doesn't matter for a fitted line. My code looks like this:

import sqlite3
import plotly.plotly as py
from plotly.graph_objs import *
import numpy as np

db = sqlite3.connect("Applications.db")
cursor = db.cursor()

cursor.execute('SELECT date FROM applications ORDER BY date(date) DESC')
dates = cursor.fetchall()
cursor.execute('SELECT lines FROM applications ORDER BY date(date) DESC')
lines = cursor.fetchall()

trace0 = Scatter(
    x=dates,
    y=lines,
    name='Amount of lines',
    mode='markers'
)
trace1 = Scatter(
    x=dates,
    y=lines,
    name='Fit',
    mode='markers'
)
data = Data([trace0, trace1])

py.iplot(data, filename = 'date-axes')

How can I make trace1 a fitted trendline based on this data? That is, a smooth representation showing how the data develops.

Per Plotly support: "Unfortunately fits aren't exposed through the API right now. We're working on add the fit GUI to the IPython interface though and eventually the API" (September 25, 2015).

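After a lot of reading and googling, I found that the simplest approach is to go through Matplotlib, NumPy, and SciPy. After cleaning up the data a bit, the following code worked: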

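import plotly.plotly as py
import plotly.tools as tls
from plotly.graph_objs import *
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from scipy.optimize import curve_fit  # needed for the curve_fit call below

# Straight-line model passed to curve_fit
def line(x, a, b):
    return a * x + b

# trend_dates and trend_lines are the cleaned arrays (cleanup not shown).
# popt holds (a, b); the plot below gets its line from np.polyfit instead.
popt, pcov = curve_fit(line, trend_dates.ravel(), trend_lines.ravel())

fig1 = plt.figure(figsize=(8,6))
# Scatter the raw points against Matplotlib date numbers
plt.plot_date(new_x, trend_lines, 'o', label='Lines')
# Degree-1 least-squares fit, evaluated at the same x positions
z = np.polyfit(new_x, trend_lines, 1)
p = np.poly1d(z)
plt.plot(new_x, p(new_x), '-', label='Fit')
plt.title('Lines per day')
# Convert the Matplotlib figure to a Plotly figure and display it
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
py.iplot(fig)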

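Basically, new_x holds the dates in the form Matplotlib expects, and trend_lines is the regular data from the question. This isn't a complete example, since a good deal of data cleaning and library imports came before it, but it shows one way to get a Plotly figure as output via Matplotlib, NumPy, and SciPy.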
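Since the cleanup step is not shown above, here is a minimal sketch of what it might look like, and of how the fitted values could feed trace1 directly without going through Matplotlib. It is not the code I actually ran. The helper names flatten and fit_trend are made up for illustration, the sample rows are just a handful copied from the question's data, and the traces are written as plain dicts on the assumption that Plotly accepts them in place of Scatter objects.

```python
from datetime import datetime

import numpy as np

# A handful of rows in the shape cursor.fetchall() returns (one-element tuples).
# Values are copied from the question's data; the real lists are longer.
dates = [('2015-12-08 00:00:00',), ('2015-11-06 00:00:00',),
         ('2015-10-07 00:00:00',), ('2015-09-17 00:00:00',),
         ('2015-03-09 00:00:00',)]
lines = [(18,), (24,), (22,), (24,), (21,)]


def flatten(rows):
    """Unpack one-element row tuples into a flat list (hypothetical helper)."""
    return [row[0] for row in rows]


def fit_trend(date_strings, values, degree=1):
    """Least-squares polynomial fit of values against time (hypothetical helper).

    Returns the polynomial coefficients and the fitted value at each date.
    """
    parsed = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in date_strings]
    # Use days since the earliest date so the x values stay small.
    first = min(parsed)
    x = np.array([(d - first).total_seconds() / 86400.0 for d in parsed])
    y = np.asarray(values, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    fitted = np.polyval(coeffs, x)
    return coeffs, fitted


x_dates = flatten(dates)
y_lines = flatten(lines)
coeffs, fitted = fit_trend(x_dates, y_lines)

# Plain-dict traces: raw points as markers, the fit as a line.
trace0 = dict(type='scatter', x=x_dates, y=y_lines,
              mode='markers', name='Amount of lines')
trace1 = dict(type='scatter', x=x_dates, y=fitted.tolist(),
              mode='lines', name='Fit')
data = [trace0, trace1]
```

The data list could then be passed to py.iplot as in the question. Raising degree gives a curved fit rather than a straight line, though with duplicate dates and few points a high degree will overfit quickly.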