Time-series Forecasting

Introduction

A time-series forecast predicts future values of a time series from its historical data. The data source for a time-series forecast must have at least one column whose values are dates in the format yyyy-MM-dd or standard DateTime values in the format yyyy-MM-ddTHH:mm:ss.
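
The two accepted formats can be checked before uploading a dataset. The following is a minimal sketch using only the Python standard library; the helper names `parse_timestamp` and `is_valid_datetime_column` are hypothetical and are not part of Actable AI.

```python
from datetime import datetime

# The two layouts named above, written as strptime patterns.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S")


def parse_timestamp(text):
    """Return a datetime if text matches an accepted format, else None."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def is_valid_datetime_column(values):
    """True if the column is non-empty and every value parses."""
    return bool(values) and all(parse_timestamp(v) is not None for v in values)


print(is_valid_datetime_column(["2011-01-01", "2011-01-02"]))
print(is_valid_datetime_column(["01/01/2011"]))
```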

Parameters

  • Date time: The temporal column. Its values must have the DATETIME type.
  • Columns: The columns used for the forecast. Unselected columns are not used in model training. This analysis does not yet support categorical data, so please do not include categorical columns. We are working on supporting them.
  • Prediction length: The number of time steps to predict. It must be shorter than one quarter of the length of the historical data.
  • Number of trials: The number of models to be trained. The best model based on validation data will be chosen to generate the final forecast. More trials usually result in a better forecast but take longer to train.
  • Filters (optional): Conditions on columns used to filter the original dataset. If set, only the matching subset of the original data is used in the analysis.
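
The limit on Prediction length can be checked with a few lines of arithmetic. This is a sketch under two assumptions: that "shorter than one quarter" is a strict inequality, and that the history length is counted in rows (time steps). The function names are hypothetical.

```python
def is_valid_prediction_length(prediction_length, history_length):
    """True if the prediction length is positive and strictly shorter
    than one quarter of the history (both counted in time steps)."""
    return prediction_length > 0 and 4 * prediction_length < history_length


def max_prediction_length(history_length):
    """Largest prediction length that satisfies the rule above (0 if none)."""
    return max((history_length - 1) // 4, 0)


# 731 is an assumed row count, for example two years of daily records.
print(max_prediction_length(731))
print(is_valid_prediction_length(50, 731))
```

With the assumed 731 daily rows, a prediction length of 50 is well within the limit.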

Case Study

Imagine we own a bike rental shop. We have two years of bike demand data and would like to predict future demand.

An example of the dataset could be:

temp      hum       windspeed  casual  registered  cnt   dteday
0.344167  0.805833  0.160446   331     654         985   2011-01-01
0.363478  0.696087  0.248539   131     670         801   2011-01-02
0.196364  0.437273  0.248309   120     1229        1349  2011-01-03
0.2       0.590435  0.160296   108     1454        1562  2011-01-04
0.226957  0.436957  0.1869     82      1518        1600  2011-01-05
0.204348  0.518261  0.0895652  88      1518        1606  2011-01-06
0.196522  0.498696  0.168726   148     1362        1510  2011-01-07
0.165     0.535833  0.266804   68      891         959   2011-01-08
0.138333  0.434167  0.36195    54      768         822   2011-01-09

The dataset records the daily temperature (temp), humidity (hum), wind speed (windspeed), number of casual customers (casual), number of registered customers (registered), number of rentals (cnt) and the date (dteday).
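
To make the column types concrete, the sketch below parses the first three sample rows from whitespace-separated text. The whitespace-separated layout and the `load_rows` helper are assumptions made for illustration; the actual file format of the dataset is not specified here. In the sample rows, cnt equals casual plus registered, which the last line checks.

```python
from datetime import date

# First three sample rows, copied from the example dataset.
SAMPLE = """\
temp hum windspeed casual registered cnt dteday
0.344167 0.805833 0.160446 331 654 985 2011-01-01
0.363478 0.696087 0.248539 131 670 801 2011-01-02
0.196364 0.437273 0.248309 120 1229 1349 2011-01-03
"""

FLOAT_COLUMNS = ("temp", "hum", "windspeed")
INT_COLUMNS = ("casual", "registered", "cnt")


def load_rows(text):
    """Parse whitespace-separated text into a list of typed dictionaries."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, records = lines[0], lines[1:]
    rows = []
    for fields in records:
        raw = dict(zip(header, fields))
        row = {name: float(raw[name]) for name in FLOAT_COLUMNS}
        row.update({name: int(raw[name]) for name in INT_COLUMNS})
        row["dteday"] = date.fromisoformat(raw["dteday"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
print(all(r["casual"] + r["registered"] == r["cnt"] for r in rows))
```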

We set our parameters as follows:

_images/setup7.png

Review Result

The result view contains five tabs: Multi lines chart, Multi y axis, Performance, Table lines chart and Table.

Multi lines chart

The Multi lines chart tab plots the values of all selected Columns on a single Y axis, with the given Date time on the X axis. The plot also includes an additional 50 days of data because we set the prediction length to 50. Since we only want to understand our future demand, the plotted columns are casual, registered and cnt. For the prediction period, the chart displays not only the forecast values but also the possible range of each value, shown as a confidence band in a lighter shade of the line's colour.

_images/multi_lines_chart.png

Click a legend entry to hide or display the corresponding line, or double-click it to display only that line, to get a clearer view of the prediction result.
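
With daily data, a prediction length of 50 extends the X axis by 50 calendar days past the last historical record. The sketch below lists those dates; the last historical date used here (2012-12-31) is an assumed value for illustration, and `forecast_dates` is a hypothetical helper.

```python
from datetime import date, timedelta


def forecast_dates(last_history_date, prediction_length):
    """Return the daily dates covered by the forecast, in order."""
    return [last_history_date + timedelta(days=step)
            for step in range(1, prediction_length + 1)]


# Assumed last day of the historical data; not taken from the dataset.
horizon = forecast_dates(date(2012, 12, 31), 50)
print(horizon[0], horizon[-1], len(horizon))
```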

Multi Y axis chart

Like the Multi lines chart tab, the Multi y axis tab plots the values of all selected Columns, but it plots them on multiple y axes without scaling them. Use the legend in the top right corner to hide or display variables.

The scale and offset of each axis can be adjusted independently by pressing the mouse button on the axis, or at the bottom of the axis, and dragging.

_images/interactive_multi_y_axis.gif

Performance

The Performance tab shows how well the model performs, using the final period of the data as validation data. Actable AI calculates several scores to evaluate the performance.

  • MAPE: Mean Absolute Percentage Error. It is calculated as the average, over all time periods, of the absolute difference between the forecast and actual values divided by the actual value.
  • MASE: Mean Absolute Scaled Error. It is calculated as the mean absolute error of the forecast values, divided by the mean absolute error of the in-sample one-step naive forecast.
  • MSE: Mean Squared Error. It is calculated as the average squared difference between the estimated values and the actual values.
  • sMAPE: Symmetric Mean Absolute Percentage Error. It addresses the asymmetry of MAPE, which puts a heavier penalty on negative errors than on positive errors. However, it is unstable when both the true value and the forecast are very close to zero.
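
The four scores can be sketched in plain Python as follows. These are common textbook definitions, not necessarily the exact formulas Actable AI uses; in particular, sMAPE has several variants, and the one below (twice the absolute error divided by the sum of absolute values) is an assumption. The numbers at the bottom are made up for illustration, with `history` standing in for the training period and `actual` for the held-out validation period.

```python
def mape(actual, forecast):
    """Mean of |actual - forecast| / |actual|. Undefined if an actual is 0."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean of squared differences between forecast and actual values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, history):
    """Forecast MAE divided by the MAE of the in-sample one-step naive
    forecast, which predicts each historical value by the one before it."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(history[i] - history[i - 1]) for i in range(1, len(history))]
    return mae / (sum(naive_errors) / len(naive_errors))


def smape(actual, forecast):
    """Assumed variant: mean of 2|f - a| / (|a| + |f|).
    Unstable when both a and f are close to zero."""
    return sum(2 * abs(f - a) / (abs(a) + abs(f))
               for a, f in zip(actual, forecast)) / len(actual)


# Made-up values for illustration only.
history = [90, 100, 120, 100]   # training period
actual = [100, 200, 400]        # validation period (end of the data)
forecast = [110, 180, 400]      # model output for the validation period

print(mape(actual, forecast), mase(actual, forecast, history))
print(mse(actual, forecast), smape(actual, forecast))
```

A MASE below 1 means the forecast has a smaller mean absolute error than the in-sample naive forecast.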

Table lines chart

The Table lines chart tab displays the values plotted in the Multi lines chart as a table.

_images/table_lines_chart.png

Table

The Table tab displays the original dataset.