Experimenting with Neural Prophet

Paul Bruffett
4 min read · Dec 16, 2021

A library for time series forecasting.

The notebook for this can be found on GitHub.

I’ll be using data on NO2 emissions in several Chinese cities in order to build some models and experiment with Neural Prophet. This library rebuilds Facebook Prophet on top of PyTorch, making it more extensible and allowing new approaches, such as AR-Net, to be integrated into the package.

The Data

We’ll download and prepare the data into two data frames, one for training a basic model using data from one city; another for predicting NO2 emissions across multiple cities.

Download and prepare data

Now our dataframe looks like this;

sample of data from one city

For our first model, we’ll train something very much like standard Prophet, which is resilient to missing values, so data preparation is minimal: we need only consolidate the four time columns into one (measurements are hourly);
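A minimal sketch of that preparation (the file name and column layout assume the UCI Beijing air-quality CSVs, which split the timestamp across year, month, day, and hour columns):

```python
import pandas as pd

# Illustrative file name; the notebook downloads the UCI Beijing
# multi-site air quality data.
raw = pd.read_csv("PRSA_Data_Aotizhongxin.csv")

# Consolidate the four time columns into a single hourly timestamp.
raw["ds"] = pd.to_datetime(raw[["year", "month", "day", "hour"]])

# NeuralProphet expects exactly two columns: 'ds' and 'y'.
df = raw.rename(columns={"NO2": "y"})[["ds", "y"]]
```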

We also simply renamed the NO2 column to ‘y’. Now on to modeling.

Training a Model

We’ll start by training a model that is very similar to classical Prophet: one that includes parameters for various time horizons, change-points, and long-term trendlines.

We accept most of the defaults for this model and let NeuralProphet split the data;
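Something along these lines (the valid_p split fraction and hourly freq are assumptions matching the data):

```python
from neuralprophet import NeuralProphet

# Default configuration: trend, change-points, and seasonalities,
# with no auto-regression.
m = NeuralProphet()

# Let NeuralProphet split the series chronologically for validation.
df_train, df_val = m.split_df(df, valid_p=0.2)

metrics = m.fit(df_train, validation_df=df_val, freq="H")
forecast = m.predict(df)
```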

This model is reasonably accurate but misses the larger swings in emissions on either side of the trend line;

last 3 months of our prediction

1-Step Ahead Forecast with Auto-Regression

Let’s take advantage of some of Neural Prophet’s features by including time lags. These lags allow the model to look back at a given number of past readings when making the next prediction. We will predict the next reading using the last 24 measurements (our readings are hourly, so this is a lag of one day);
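A sketch of that configuration, with everything else left at its default:

```python
from neuralprophet import NeuralProphet

# n_lags=24: look back at the previous 24 hourly readings (one day)
# to predict the next value.
m = NeuralProphet(n_lags=24)

df_train, df_val = m.split_df(df, valid_p=0.2)
metrics = m.fit(df_train, validation_df=df_val, freq="H")
```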

This model’s training and validation losses converge much more closely than in our traditional Prophet example, and our Smooth L1 loss is roughly one hundredth that of the traditional model (0.0045 vs. 0.433, respectively).

Using a Neural Net

Using a neural net we can model non-linear relationships in our data. Holding the other parameters the same, we can add 4 layers of 16 neurons each to define a neural network that uses the lagged inputs to make predictions;
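Roughly as follows; the parameter names here match the 2021-era NeuralProphet API (newer releases express the same network as ar_layers=[16, 16, 16, 16]):

```python
from neuralprophet import NeuralProphet

# Same one-day lag window, plus a 4-layer x 16-unit feed-forward
# network for the auto-regressive component.
m = NeuralProphet(n_lags=24, num_hidden_layers=4, d_hidden=16)

df_train, df_val = m.split_df(df, valid_p=0.2)
metrics = m.fit(df_train, validation_df=df_val, freq="H")
```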

The results are very similar to the previous model’s, but this model does offer a more interesting and interpretable view of its AR weights;

This view shows how heavily the model weights each of the 24 previous inputs used as the time lag for predictions.

24-Step Ahead

With Prophet, we can predict an arbitrary number of time steps ahead, but each prediction feeds into the next, so rolling a 1-step-ahead model forward over a long horizon is inefficient. With this model, let’s instead predict 24 steps ahead per inference, using 3 days (72 hours) of lagged data;
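For example:

```python
from neuralprophet import NeuralProphet

# Predict the next 24 hours in one shot from the previous 72 hours.
m = NeuralProphet(n_lags=72, n_forecasts=24)

df_train, df_val = m.split_df(df, valid_p=0.2)
metrics = m.fit(df_train, validation_df=df_val, freq="H")
```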

This model converges with the lowest validation Smooth L1 loss, at 0.309.

We can see this model fits our test dataset much more closely than the original Prophet-esque version.

Predicting

Like the original Prophet, the package will create a future data frame; this automatically includes the number of periods into the future specified by the model, in this case 24 (hours). The model can then predict against this;
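A sketch of the call, assuming m is the fitted 24-step model and df is the prepared data frame:

```python
# Build a frame holding the 24 future periods plus the historical
# rows the model's lags require, then predict against it.
future = m.make_future_dataframe(df, periods=24)
forecast = m.predict(future, raw=True, decompose=False)
```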

Enabling raw returns an array of predictions, one for each step into the future, while disabling decompose prevents residuals and other component information from being returned, giving us only the prediction at each time step.

make_future_dataframe also appropriately includes the number of historical rows required by the model’s time lags.

Additional Regressors

Lastly, we can add regressors; these are additional features Prophet can use when forecasting. The key consideration with additional regressors, especially for long-range forecasting, is that their values must be available in the dataset at inference time. If weather data is used to make NO2 predictions for the next week, we must supply the next week’s hourly weather forecast as input; if that forecast is inaccurate, it may reduce the overall accuracy of our model.
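One way to register such a future regressor (the ‘TEMP’ column here is a hypothetical hourly temperature feature, which would also have to be supplied, e.g. as a forecast, at prediction time):

```python
from neuralprophet import NeuralProphet

m = NeuralProphet(n_lags=72, n_forecasts=24)

# Register a regressor whose future values must be provided
# (e.g. from a weather forecast) when predicting.
m.add_future_regressor("TEMP")

# The training frame now needs ds, y, and TEMP columns.
metrics = m.fit(df_with_temp, freq="H")
```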

Training & Saving

Finally, let’s train a model for each of the cities in the dataset and save the trained models.

We prepare the dataset, then configure and fit a model for each city, saving each model after training.
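A sketch of that loop; city_frames is assumed to be a dict mapping each city name to its prepared ds/y data frame, and pickling is one simple way to persist the fitted models:

```python
import pickle

from neuralprophet import NeuralProphet

for city, city_df in city_frames.items():
    # One fresh model per city, configured identically.
    m = NeuralProphet(n_lags=72, n_forecasts=24)
    m.fit(city_df, freq="H")

    # Persist the fitted model for later inference.
    with open(f"{city}_model.pkl", "wb") as f:
        pickle.dump(m, f)
```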

This done, we can load a specific model and predict the next 24 hours of NO2 emissions for a given city;
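For instance, assuming the same pickled files and city_frames dict as above (the station name is illustrative):

```python
import pickle

city = "Aotizhongxin"  # hypothetical city/station name

with open(f"{city}_model.pkl", "rb") as f:
    m = pickle.load(f)

# Forecast the next 24 hours of NO2 for this city.
future = m.make_future_dataframe(city_frames[city], periods=24)
forecast = m.predict(future, raw=True, decompose=False)
```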
