Error when trying to implement Holt-Winters Exponential Smoothing in PySpark
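I am trying to run Holt-Winters exponential smoothing on my dataset FinalModel, which has Date as its index and a CrimeCount column alongside other columns. I only want to forecast the CrimeCount column, but I get the following error: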
ValueError: Buffer dtype mismatch, expected 'double' but got 'long long'
My code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing

df = FinalModel.copy()
train, test = FinalModel.iloc[:85, 18], df.iloc[85:, 18]
df.index.freq = 'MS'
# The ValueError above is raised on this line (integer input)
model = ExponentialSmoothing(train.astype(np.int64), seasonal='mul', seasonal_periods=12).fit()
pred = model.predict(start=test.index[0], end=test.index[-1])
plt.plot(train.index, train, label='Train')
plt.plot(test.index, test, label='Test')
plt.plot(pred.index, pred, label='Holt-Winters')
plt.legend(loc='best')
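The error says the input values were expected to be doubles, but long integers were received instead. Forcing the input values to be numpy floats rather than numpy integers fixes it: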
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing

df = FinalModel.copy()
df.index.freq = 'MS'  # monthly data, month-start frequency
# First 85 rows for training, the rest for testing (column 18 of the question's frame)
train, test = df.iloc[:85, 18], df.iloc[85:, 18]
# Cast to float64 ('<f8') instead of int64 so the model receives doubles
model = ExponentialSmoothing(train.astype('<f8'), seasonal='mul', seasonal_periods=12).fit()
pred = model.predict(start=test.index[0], end=test.index[-1])
plt.plot(train.index, train, label='Train')
plt.plot(test.index, test, label='Test')
plt.plot(pred.index, pred, label='Holt-Winters')
plt.legend(loc='best')
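In general, most statistical models from statsmodels and sklearn assume the input values are floats. Most of these methods convert the input for you automatically, but ExponentialSmoothing apparently does not. Either way, for consistency it is a good habit to cast your input values to floats.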
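If you want to check the dtype before handing a series to a model, something like the following might help. It is only a minimal sketch: the `counts` series is made-up data standing in for the CrimeCount column, and `as_float_series` is a hypothetical helper name, not part of pandas or statsmodels.

```python
import numpy as np
import pandas as pd

# Made-up monthly counts standing in for the CrimeCount column (assumption)
idx = pd.date_range("2015-01-01", periods=6, freq="MS")
counts = pd.Series([12, 15, 9, 20, 18, 11], index=idx, dtype="int64")


def as_float_series(series):
    """Return the series as float64; float64 input is returned unchanged."""
    if series.dtype != np.float64:
        return series.astype(np.float64)
    return series


counts_f = as_float_series(counts)
print(counts.dtype, "->", counts_f.dtype)
```

The cast keeps the index untouched and only changes how the values are stored, so `counts_f` can be passed on wherever the integer series was used. On little-endian machines, `'<f8'` in the answer names the same type as `np.float64`.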