
Intro to forecasting with FB's Prophet (python)

 5 years ago
source link: https://www.tuicool.com/articles/hit/6z2Mry2

Introduction to forecasting with FB Prophet

Prophet is a forecasting tool developed by Facebook for quickly forecasting time series data; it is available in R and Python. In this post I'll walk you through a quick example of how to forecast U.S. candy production using Prophet and Python.

First, we'll read in the data, which shows the 'industrial production index', or INDPRO (detail here), for candy in the U.S. You can download the data in our GitHub repository here.

#import libraries

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

from prophet import Prophet  #the package was named fbprophet before v1.0

#read in and preview our data

df = pd.read_csv('./datasets/candy_production.csv')

df.head()

  observation_date  IPG3113N
0       1972-01-01   85.6945
1       1972-02-01   71.8200
2       1972-03-01   66.0229
3       1972-04-01   64.5645
4       1972-05-01   65.0100

We now have data showing U.S. candy production (an index normalized so that 2012=100 in this dataset), which we can use as the input for our time series forecasting model with Prophet. Next, we'll need to do a little bit of cleaning to prep the data for Prophet.

#rename date column ds, value column y per Prophet specs

df.rename(columns={'observation_date': 'ds'}, inplace=True)

df.rename(columns={'IPG3113N': 'y'}, inplace=True)

#ensure our ds value is truly datetime

df['ds'] = pd.to_datetime(df['ds'])

#filter on >=1995, to pull roughly the last 20 years of production information

start_date = '1995-01-01'

mask = (df['ds'] >= start_date)

df = df.loc[mask]
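The cleaning steps above can also be collected into one small helper. This is only a sketch: the function name `prepare_for_prophet` and its parameters are hypothetical (they are not part of Prophet or of the original notebook), and the sample rows are the first three shown in the preview of the candy data.

```python
import pandas as pd


def prepare_for_prophet(raw, date_col, value_col, start_date=None):
    """Return a copy of `raw` with the `ds` / `y` columns Prophet expects.

    Hypothetical helper for illustration; not part of the Prophet API.
    """
    # Rename the two columns of interest and drop everything else
    out = raw.rename(columns={date_col: "ds", value_col: "y"})[["ds", "y"]].copy()
    # Make sure ds is a real datetime column, not a string column
    out["ds"] = pd.to_datetime(out["ds"])
    # Optionally keep only observations on or after start_date (inclusive)
    if start_date is not None:
        out = out.loc[out["ds"] >= pd.Timestamp(start_date)]
    return out.sort_values("ds").reset_index(drop=True)


# First rows of the candy dataset, copied from the preview
sample = pd.DataFrame({
    "observation_date": ["1972-01-01", "1972-02-01", "1972-03-01"],
    "IPG3113N": [85.6945, 71.8200, 66.0229],
})

prepared = prepare_for_prophet(
    sample, "observation_date", "IPG3113N", start_date="1972-02-01"
)
print(prepared)
```

Using an inclusive comparison (`>=`) matters here because the observations fall on the first day of each month, so a strict `>` would silently drop the starting month.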

Next, we can load our dataframe, df, into Prophet and set the number of days we want it to predict. Note that periods is counted in days by default, so 730 covers roughly two years.

#initialize Prophet

m = Prophet()

#point towards dataframe

m.fit(df)

#set future prediction window of 2 years

future = m.make_future_dataframe(periods=730)

#preview the future dataframe -- it holds only dates (no values) until we call the predict method

future.tail()

             ds
996  2019-07-28
997  2019-07-29
998  2019-07-30
999  2019-07-31
1000 2019-08-01
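Because the candy data is monthly, the `periods=730` call above extends it with daily dates. If monthly steps are preferred, one option is to ask for month-start dates instead. The sketch below uses only pandas to show what those dates look like; the last observed month (2017-08-01) is an assumption inferred from the output above (730 days before 2019-08-01), so substitute `df['ds'].max()` from your own data.

```python
import pandas as pd

# Assumed last observed month, inferred from the tutorial output;
# in practice use df['ds'].max()
last_observed = pd.Timestamp("2017-08-01")

# 24 month-start dates after the last observation (2 years of monthly steps).
# With a fitted model m, the analogous Prophet call would be
#   m.make_future_dataframe(periods=24, freq='MS')
# which by default also includes the historical dates.
future_months = pd.DataFrame({
    # periods=25 includes last_observed itself, so drop the first entry
    "ds": pd.date_range(start=last_observed, periods=25, freq="MS")[1:]
})

print(future_months.head())
print(len(future_months))
```

Switching to monthly steps changes the row count and the dates in every output that follows, so the tables in this post correspond to the daily version only.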

Next, we can call the predict method, which will assign each row in our 'future' dataframe a predicted value, which it names yhat. Additionally, it will show lower/upper bounds of uncertainty, called yhat_lower and yhat_upper.

forecast = m.predict(future)

forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

             ds        yhat  yhat_lower  yhat_upper
996  2019-07-28  117.796140  111.591576  123.519210
997  2019-07-29  117.016641  110.999394  122.908126
998  2019-07-30  116.001765  109.887603  122.016393
999  2019-07-31  114.757009  108.375085  121.465374
1000 2019-08-01  113.293294  107.234872  119.295134

We now have an initial time series forecast using Prophet, and we can plot the results as shown below:
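To get a feel for how wide the uncertainty band is, the bounds can be subtracted directly. The sketch below rebuilds the five forecast rows shown above by hand (so it runs without fitting a model) and adds an `interval_width` column, which is a name chosen for this illustration rather than a column Prophet produces. By default Prophet reports an 80% interval, controlled by the `interval_width` argument of `Prophet`.

```python
import pandas as pd

# The last five forecast rows, copied from the output above
tail = pd.DataFrame({
    "ds": pd.to_datetime([
        "2019-07-28", "2019-07-29", "2019-07-30", "2019-07-31", "2019-08-01",
    ]),
    "yhat": [117.796140, 117.016641, 116.001765, 114.757009, 113.293294],
    "yhat_lower": [111.591576, 110.999394, 109.887603, 108.375085, 107.234872],
    "yhat_upper": [123.519210, 122.908126, 122.016393, 121.465374, 119.295134],
})

# Width of the uncertainty band for each forecast date (illustrative column)
tail["interval_width"] = tail["yhat_upper"] - tail["yhat_lower"]

# Sanity check: each point forecast should sit inside its own band
tail["inside_band"] = tail["yhat"].between(tail["yhat_lower"], tail["yhat_upper"])

print(tail[["ds", "yhat", "interval_width", "inside_band"]])
```

On a real `forecast` dataframe the same two lines work unchanged, since it carries the same `yhat`, `yhat_lower` and `yhat_upper` columns.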

fig1 = m.plot(forecast)

[Figure pic1.png: output of m.plot(forecast)]

fig2 = m.plot_components(forecast)

[Figure pic2.png: output of m.plot_components(forecast)]

Here we can see that, at a high level, production is expected to continue its upward trend over the next couple of years. Additionally, we can see the spikes in production around the various U.S. holidays (Valentine's Day, Halloween, Christmas). The Jupyter notebook used in this exercise can be found in our GitHub repository here.

For more in-depth reading, I'd recommend checking out the docs, as they're pretty easy to understand and include additional detail and examples.

