Yes, our favorite fbprophet is back with multivariate forecasting.

Rupak (Bob) Roy - II
4 min read · Jul 10, 2022


The next best alternative to Multivariate LSTM Time Series Forecasting.

Hi there, we are back again with a new topic. Keeping it short and simple: this time I will demonstrate how to perform multivariate time series forecasting using our lightning-fast fbprophet approach.

Yes, you heard it right, we can now perform Multivariate Forecasting with fbprophet.

Let's get started, shall we?

Here is the dataset: https://www.kaggle.com/datasets/rupakroy/stock-trading-data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
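# Prophet 1.0+ is published as "prophet"; there, use: from prophet import Prophet
from fbprophet import Prophet
df = pd.read_csv("stock_trading_data.csv")
# we will take only the first 5 columns
df = df.iloc[:,:5]
# convert the Date column to datetime
df["Date"] = pd.to_datetime(df["Date"])

Figure: stock trading dataset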

Now we will divide the dataset into two parts, train and test, in a 70:30 ratio.

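# Divide the data into train and test
test_size = np.round(df.shape[0] * 30 / 100).astype(int)
# note: the first 30% of rows become the test set, so check how the file is sorted
# .copy() so the later column renames do not act on a slice of df
df1_train = df.iloc[test_size:, :].copy()
df1_test = df.iloc[:test_size, :].copy()
df1_train.dtypes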
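Slicing by row position only holds out the most recent data if the file happens to be sorted that way. Here is a minimal sketch of a split that sorts by date first, assuming you want the newest 30% of rows as the test set. The helper name chronological_split and the toy frame are made up for illustration and are not part of the original notebook.

```python
import pandas as pd

def chronological_split(frame, date_col="Date", test_fraction=0.30):
    """Return (train, test) with the most recent rows held out as test."""
    ordered = frame.sort_values(date_col).reset_index(drop=True)
    test_size = int(round(len(ordered) * test_fraction))
    split_at = len(ordered) - test_size
    # .copy() so later edits do not touch the original frame
    return ordered.iloc[:split_at].copy(), ordered.iloc[split_at:].copy()

# Toy data: 10 daily rows stored newest-first on purpose
toy = pd.DataFrame({
    "Date": pd.date_range("2022-01-01", periods=10, freq="D")[::-1],
    "Open": [float(v) for v in range(10)],
})

train, test = chronological_split(toy)
print(len(train), len(test))
print(train["Date"].max(), test["Date"].min())
```

With the real data this would be called as chronological_split(df) right after the Date column is converted.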

# let's plot each column against the date

# pass figsize to subplots; a separate plt.figure() call would just create an unused empty figure
figure, axes = plt.subplots(nrows=2, ncols=2, figsize=(10,8))
axes[0,0].plot(df1_train["Date"],df1_train["Open"],label="Open")
axes[0,1].plot(df1_train["Date"],df1_train["High"],label="High")
axes[1,0].plot(df1_train["Date"],df1_train["Low"],label="Low")
axes[1,1].plot(df1_train["Date"],df1_train["Close"],label="Close")
# add a legend to every panel, not only the last one
for ax in axes.flat:
    ax.legend()
Figure: multivariate plots of 'Open', 'High', 'Low', 'Close' respectively

As you may know, Prophet needs the dataset in ds, y format: 'ds' is the date column and 'y' is the target.

# prepare the dataset for fbprophet
df1_train = df1_train.rename(columns={"Open":'y',"Date":'ds'})
df1_train.head(5)

Now it's time to apply the magic!

model = Prophet(interval_width=0.9)
model.add_regressor('High',standardize=False)
model.add_regressor('Low',standardize=False)
model.add_regressor('Close', standardize=False)
model.fit(df1_train)

Done! All we need to do is add each extra column with add_regressor, that's it.

If you remember the formula of a linear regression:

y = m*x + c, where m = the beta coefficient, x = the input variable, c = the intercept

And if you remember how a time series works with autoregression:

X(t) = c + m1*X(t-1) + m2*X(t-2) + ... + mp*X(t-p)

An autoregression is already multivariate in the sense that it computes the next value from lagged versions of itself. Prophet's own model is additive (trend, seasonality and holiday terms), and each column passed to add_regressor enters as one more linear term, beta*x(t). Adding the terms for 'High', 'Low' and 'Close' is how we get a multivariate time series model.
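To make the "lags plus extra variables" idea concrete, here is a small sketch that fits X(t) = c + m1*X(t-1) + b*Z(t) by ordinary least squares on synthetic data. It illustrates the regression idea only and is not how Prophet fits its model internally. The series, the coefficients (2.0, 0.6, 1.5) and all variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic external regressor and a target that depends on its own lag and on z
z = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 2.0 + 0.6 * x[t - 1] + 1.5 * z[t] + rng.normal(scale=0.1)

# Design matrix columns: intercept, lag-1 of x, current value of z
design = np.column_stack([np.ones(n - 1), x[:-1], z[1:]])
target = x[1:]

# Least-squares estimate of (c, m1, b)
coef, *_ = np.linalg.lstsq(design, target, rcond=None)
c_hat, m1_hat, b_hat = coef
print("intercept:", c_hat, "lag coefficient:", m1_hat, "regressor coefficient:", b_hat)
```

The recovered coefficients should land close to the values used to generate the data, which is the same "one beta per column" picture that add_regressor gives you.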

#To view the model parameters 
model.params
Multivariate model parameters

We will replicate the training dataset without the 'y' target column and see how well the model reproduces it from the regressors alone.

# understanding the model fit
df1_train_2 = df1_train[["ds","High","Low","Close"]]
# we will be predicting 'y', i.e. "Open"
forecast1_train = model.predict(df1_train_2)
forecast1_train = forecast1_train[['ds','yhat']]
# predict() returns rows ordered by 'ds', so order the actuals the same way before joining side by side
df_model_fit = pd.concat((forecast1_train['yhat'],df1_train.sort_values('ds',kind='mergesort').reset_index(drop=True)),axis=1)

Combine the predicted (fitted) and actual values into one dataset and plot 'y' vs 'yhat'.

df_model_fit
#Visualize it 
plt.figure(figsize=(8,6))
plt.plot(df_model_fit['ds'],df_model_fit['y'],color='red',label='actual')
plt.plot(df_model_fit['ds'],df_model_fit['yhat'],color='blue',label='Forecasted')
plt.legend()
actual vs forecasted

Looks great!

Let's try the same with the test (unseen) dataset.

# create a test dataframe
df1_test = df1_test.rename(columns={"Open":'y',"Date":'ds'})
df1_test.head(5)
df1_test_2 = df1_test[["ds","High","Low","Close"]]  # we will be predicting 'y', i.e. "Open"
df1_test_2
forecast1_test = model.predict(df1_test_2)
forecast1_test = forecast1_test[['ds','yhat']]
# order the actuals by 'ds' to match the row order of the predict() output
df_testdata_fit = pd.concat((forecast1_test['yhat'],df1_test.sort_values('ds',kind='mergesort').reset_index(drop=True)),axis=1)
# Visualize it
plt.figure(figsize=(8,6))
plt.plot(df_testdata_fit['ds'],df_testdata_fit['y'],color='red',label='actual')
plt.plot(df_testdata_fit['ds'],df_testdata_fit['yhat'],color='blue',label='Forecasted')
plt.legend()
actual vs forecasted

Seems the results are improving!
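A plot is easier to trust with a number next to it. Below is a minimal sketch of three common error measures (MAE, RMSE, MAPE) computed from actual and predicted values. The helper name forecast_errors and the three sample numbers are assumptions for illustration; they are not results from the stock dataset.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = actual - predicted
    return {
        "mae": float(np.mean(np.abs(residual))),
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        # MAPE divides by the actual values, so it assumes none of them are zero
        "mape_pct": float(np.mean(np.abs(residual / actual)) * 100),
    }

# Made-up values, only to show the call
scores = forecast_errors([100.0, 102.0, 104.0], [101.0, 101.0, 106.0])
print(scores)
```

On the test set this would be called as forecast_errors(df_testdata_fit['y'], df_testdata_fit['yhat']), and the same call on df_model_fit gives the training-side numbers to compare against.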

But wait, this was done with the default fbprophet settings, without proper parameter tuning. Another way to improve the results is to check for gaps (NAs) in the timeline and, if there are any, reframe the dataset using

df.resample("30min", on="DateColumnName").Duration.mean().reset_index()

Documentation for resampling: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html
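Here is a small sketch of that gap check on a toy daily series, assuming a 'Date' column and a single 'Open' column. Resampling to a regular daily grid exposes the missing date as a NaN row, and linear interpolation is just one possible way to fill it. For stock data, weekends and holidays are genuine non-trading days, so filling them may not be what you want.

```python
import pandas as pd

# Toy series with 2022-01-05 missing
raw = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-06", "2022-01-07"]),
    "Open": [10.0, 11.0, 13.0, 14.0],
})

# Put the data on a regular daily grid; days with no rows come back as NaN
regular = raw.resample("D", on="Date").mean().reset_index()

# Which dates were missing from the original timeline?
missing_days = regular.loc[regular["Open"].isna(), "Date"]
print(missing_days.tolist())

# One option among several: fill the gaps by linear interpolation
filled = regular.copy()
filled["Open"] = filled["Open"].interpolate()
print(filled)
```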

Even though taking a proper time frame is necessary for time series analysis, there are tons of ways to achieve proper accuracy.

The objective of this article is to demonstrate how to apply multivariate forecasting in a faster way than our traditional LSTM approach. I hope you don't mind that the model is not tuned.

# make_future_dataframe approach
future = model.make_future_dataframe(periods=377)
future.tail()
future_prediction = model.predict(future)

We are also aware of the 'make_future_dataframe' approach in fbprophet, but it will not work here. The dataframe it builds contains only the 'ds' column, so model.predict(future) throws an error about the missing independent variables ('High', 'Low', 'Close').

model.predict(future)
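One way around the error, sketched here under the assumption that you already have (or have separately forecast) values of 'High', 'Low' and 'Close' for the future dates, is to attach them to the future frame before calling predict. The helper attach_regressors and the toy frames are hypothetical names for illustration; they are not part of Prophet.

```python
import pandas as pd

def attach_regressors(future, known, regressors=("High", "Low", "Close")):
    """Add regressor columns to a future frame by matching rows on 'ds'."""
    cols = ["ds", *regressors]
    merged = future[["ds"]].merge(known[cols], on="ds", how="left")
    missing = merged[list(regressors)].isna().any(axis=1)
    if missing.any():
        # Fail early with a clear message instead of passing NaNs to predict()
        raise ValueError(f"{int(missing.sum())} future rows have no regressor values")
    return merged

# Toy stand-ins: 'known' plays the role of a table with regressor values per date
known = pd.DataFrame({
    "ds": pd.date_range("2022-01-01", periods=5, freq="D"),
    "High": [11.0, 12.0, 13.0, 14.0, 15.0],
    "Low": [9.0, 10.0, 11.0, 12.0, 13.0],
    "Close": [10.0, 11.0, 12.0, 13.0, 14.0],
})
future_toy = pd.DataFrame({"ds": pd.date_range("2022-01-03", periods=3, freq="D")})

ready = attach_regressors(future_toy, known)
print(ready)
# With a fitted model: model.predict(attach_regressors(future, known))
```

The catch is that same-day 'High', 'Low' and 'Close' are not known ahead of time, so for a genuine forecast those columns would have to come from their own forecasts or be replaced with lagged values.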

Putting all of the pieces together.

Template: Multivariate Facebook Prophet Time Series Forecasting

I hope you find this article useful for your machine learning and statistical use cases. Likewise, I will try to bring you new approaches, with the motto "curiosity leads to innovation" :)

Check out the kaggle implementation: https://www.kaggle.com/rupakroy/multivariate-timeseries-fbprophet

Github Repo: https://github.com/rupak-roy/Multivariate-Facebook-Prophet-Time-Series-Forecasting-Template

Walkthrough of multivariate forecasting using fbprophet

Thanks again for your time. If you enjoyed this short article, there are tons of topics in advanced analytics, data science, and machine learning available in my Medium repo: https://medium.com/@bobrupakroy

Some of my alternative internet presences are Facebook, Instagram, Udemy, Blogger, Issuu, Slideshare, Scribd and more.

Also available on Quora @ https://www.quora.com/profile/Rupak-Bob-Roy

Let me know if you need anything. Talk Soon.
