Python Prophet demand forecasting for multiple products: saving all forecasts into a single data frame

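I have the following code that creates time series forecasts for three products (A, B, and C). It saves the forecasts for all products into one data frame, forecast_df. However, I can't figure out how to put the product name in each row alongside that product's forecast. The code creates a column for the product name, but it fills in C for every product except C, which it labels NaN. How can I get it to fill in the product names correctly?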

import pandas as pd
import numpy as np
from datetime import date, timedelta, datetime
from fbprophet import Prophet
from pmdarima.model_selection import train_test_split

class color:
    BOLD = '\033[1m'  # ANSI escape: start bold text
    END = '\033[0m'   # ANSI escape: reset formatting

df = pd.DataFrame({"ds": ['2020-4-26','2020-5-3','2020-5-10','2020-5-17','2020-5-24',
                          '2020-5-31','2020-6-7','2020-6-14','2020-6-21','2020-6-28'],
                   "A": [164,127,157,127,170,322,133,176,233,257],
                   "B": [306,405,267,265,306,265,325,297,310,271],
                   "C": [23,41,75,24,48,31,51,26,41,43]})

df['ds'] = pd.to_datetime(df['ds'])

start_date = min(df['ds'])
end_date = max(df['ds'])
print(start_date, end_date)

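# NOTE: `week` is not defined in the post as given. It is assumed here to be
# the number of weeks spanned by the data (9), which gives 8 training rows.
week = (end_date - start_date).days // 7
train_len = int(week * 0.99)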
print(train_len)

forecast_df = pd.DataFrame()

for col in df.columns[1:]:
    print('\n', color.BOLD + 'ITEM #', col + color.END)
    dfx = df[['ds', col]]
    dfx = dfx.rename({col: 'y'}, axis=1)

    #Train-Test-Split
    y_train, y_test = train_test_split(dfx, train_size=train_len)

    #Fit Model
    m = Prophet()
    m.fit(y_train)

    #Calculate forecast
    print('\n', color.BOLD + 'CALCULATE FORECAST' + color.END)
    future = m.make_future_dataframe(periods=5, freq='W')
    forecast = m.predict(future)
    print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail())

    #save forecast to dataframe
    for col in df.columns[1:]:
        forecast_df['Product'] = col
        #forecast_df = pd.concat((forecast_df, forecast[['ds', 'yhat','yhat_upper','yhat_lower']]))
        forecast_df = forecast_df.append(forecast[['ds', 'yhat','yhat_upper','yhat_lower']])

print('\n', color.BOLD + 'FORECAST DATAFRAME' + color.END)
print(forecast_df)

Here is the output:

 ITEM # A

 CALCULATE FORECAST
           ds        yhat  yhat_lower  yhat_upper
8  2020-06-21  207.886980  137.260222  277.544817
9  2020-06-28  215.891658  144.606357  287.614507
10 2020-07-05  223.896335  152.586583  293.597136
11 2020-07-12  231.901013  154.032672  304.878146
12 2020-07-19  239.905690  168.971432  311.714361

 ITEM # B

 CALCULATE FORECAST
           ds        yhat  yhat_lower  yhat_upper
8  2020-06-21  281.352232  228.094939  330.983604
9  2020-06-28  276.200199  221.798843  329.703188
10 2020-07-05  271.048166  216.131704  322.966893
11 2020-07-12  265.896133  214.037533  317.526028
12 2020-07-19  260.744100  209.923955  312.366033

 ITEM # C

 CALCULATE FORECAST
           ds       yhat  yhat_lower  yhat_upper
8  2020-06-21  38.015576   16.388462   58.753207
9  2020-06-28  37.578068   16.483414   60.281396
10 2020-07-05  37.140560   16.269026   58.592980
11 2020-07-12  36.703052   16.351184   58.890988
12 2020-07-19  36.265544   14.038544   56.053327

 FORECAST DATAFRAME
   Product         ds        yhat  yhat_upper  yhat_lower
0        C 2020-04-26  143.849560  215.773593   71.588250
1        C 2020-05-03  151.854238  223.179640   77.229544
2        C 2020-05-10  159.858915  231.388203   87.914720
3        C 2020-05-17  167.863593  243.705433   97.468648
4        C 2020-05-24  175.868270  247.448227  103.620476
..     ...        ...         ...         ...         ...
8      NaN 2020-06-21   38.015576   57.919222   17.838312
9      NaN 2020-06-28   37.578068   57.668158   15.740910
10     NaN 2020-07-05   37.140560   57.990138   16.508734
11     NaN 2020-07-12   36.703052   59.519944   17.110290
12     NaN 2020-07-19   36.265544   56.410552   15.599971

[117 rows x 5 columns]
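
One plausible reading of this output, sketched with plain pandas and made-up numbers in place of the Prophet forecasts: assigning a scalar to the Product column relabels every row already in the frame, while rows added afterwards have no Product value and show up as NaN. Because the inner loop ends on C, the older rows all read C. The two small frames and their values below are assumptions for illustration only, and pd.concat is used because DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd

# Stand-ins for two Prophet forecasts (values are invented for illustration).
dates = pd.to_datetime(["2020-06-21", "2020-06-28"])
forecast_a = pd.DataFrame({"ds": dates, "yhat": [1.0, 2.0]})
forecast_b = pd.DataFrame({"ds": dates, "yhat": [3.0, 4.0]})

out = pd.DataFrame()
for name, fc in [("A", forecast_a), ("B", forecast_b)]:
    # Scalar assignment overwrites the label on every row already in `out`.
    out["Product"] = name
    # The rows added here carry no Product value, so they come out as NaN.
    out = pd.concat([out, fc])

print(out)
```

The rows from the first forecast end up labelled B and the rows from the second are NaN, which is the same pattern as in the output above.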

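Something like this might work. You should also mention how you want the three values (the forecasts for A, B, and C) to be set up in the column named Product.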

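for colname in ['A', 'B', 'C']:
    # Prophet expects exactly two columns named 'ds' (date) and 'y' (value)
    dd = df.loc[:, ['ds', colname]]
    dd.columns = ['ds', 'y']
    m = Prophet()
    m.fit(dd)
    future = m.make_future_dataframe(periods=5, freq='W')
    forecast = m.predict(future)
    # Assignment aligns on the index, so this stores yhat for the rows already in df
    df['Forecast_' + colname] = forecast['yhat']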
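
One thing to keep in mind with this approach, shown here with toy frames rather than real Prophet output (all names and values are made up): assigning forecast['yhat'] to a column of df aligns on the index, so only the rows that already exist in df are filled and the extra future rows are dropped.

```python
import pandas as pd

# Toy history with 3 rows and a toy forecast with 5 rows (3 fitted + 2 future).
history = pd.DataFrame({"y": [10, 20, 30]})
forecast = pd.DataFrame({"yhat": [11.0, 19.0, 31.0, 40.0, 50.0]})

# Column assignment aligns on the index: rows 3 and 4 have no match in `history`.
history["Forecast_A"] = forecast["yhat"]

print(history)  # still 3 rows; the two future values are not stored
```

If the future periods are needed as well, stacking the forecasts row-wise with a product label keeps them.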

Thanks, Katerina!

I had to modify it a little, but this is the answer that worked for me:

# save the forecast to the dataframe
forecast['Item'] = col

# concat all results into one dataframe
forecast_df = pd.concat([forecast_df, forecast[['Item', 'ds', 'Quantity_Ordered', 'yhat', 'yhat_lower', 'yhat_upper']]], ignore_index=True)
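
For context, here is a minimal sketch of how those two lines might sit inside the per-product loop. To keep it runnable without Prophet, the model step is replaced by stand_in_forecast, a hypothetical placeholder that just repeats the mean of y; in the real code that is where m.fit and m.predict would go. The data is the first three rows from the question, and collecting the pieces in a list before a single pd.concat is one common variant, not the only way to do it.

```python
import pandas as pd

# First three rows of the data from the question.
df = pd.DataFrame({
    "ds": pd.to_datetime(["2020-04-26", "2020-05-03", "2020-05-10"]),
    "A": [164, 127, 157],
    "B": [306, 405, 267],
    "C": [23, 41, 75],
})

def stand_in_forecast(dfx, periods=2):
    """Hypothetical placeholder for Prophet fit/predict: repeats the mean of y."""
    future = pd.date_range(dfx["ds"].max(), periods=periods + 1, freq="7D")[1:]
    return pd.DataFrame({"ds": future, "yhat": dfx["y"].mean()})

frames = []
for col in df.columns[1:]:
    dfx = df[["ds", col]].rename(columns={col: "y"})
    forecast = stand_in_forecast(dfx)   # Prophet would be fitted here instead
    forecast["Item"] = col              # label this product's rows before stacking
    frames.append(forecast[["Item", "ds", "yhat"]])

# One concat at the end; every row already carries its own label.
forecast_df = pd.concat(frames, ignore_index=True)
print(forecast_df)
```

The key point is that the label is written onto each product's forecast before it is combined, so no row is left unlabelled or relabelled later.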