A deep dive into time series forecasting techniques

Are you tired of making guesses about the future of your business or investments? Do you want to make data-driven decisions with confidence? Time series forecasting might just be the solution you’re looking for.

Madhav

Dec 19, 2023


Chapters

What is Time Series Data?
Types of Time Series Data
Methods to check Stationarity
Methods for determining PDQ Values (optimal hyper parameters)
Evaluating a forecasting model

By analysing historical data, identifying patterns, and using statistical models, time series forecasting can help you make accurate predictions about future trends and behaviour. In this blog, we’ll dive into the world of time series forecasting, exploring its benefits, best practices, and common techniques. Whether you’re a seasoned data scientist or just getting started, this blog will equip you with the knowledge you need to start forecasting like a pro. We will conclude with a detailed guide on how to forecast time series data using the ARIMA model. So, let’s get started!

What is Time Series Data?

Time series data is a set of data points collected in chronological order, usually at regular intervals. It is widely used in fields such as economics, business, and weather forecasting. To understand a time series, we must first understand its main components.

The components of a time series are the seasonal trend, cyclic trend, upward and downward trends, serial correlation, and non-linear trends. The seasonal trend is a pattern that repeats over a fixed, shorter period such as days, weeks, or months. The cyclic trend is a pattern that recurs over a longer term such as years or decades, usually without a fixed period. An upward or downward trend is a gradual, sustained rise or fall in the level of the series. Serial correlation is the relationship between adjacent data points. Lastly, non-linear trends are complex patterns that are difficult to predict.
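As a rough illustration of these components, the sketch below builds a synthetic series from a trend, a seasonal pattern, and random noise, then measures serial correlation at a given lag. The series, the period of 12, and the helper name `autocorr` are assumptions made for the example, not part of any particular dataset or library.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Hypothetical monthly-style series: 120 observations with a period of 12
t = np.arange(120)
rng = np.random.default_rng(3)
trend = 0.2 * t                             # gradual upward trend
seasonal = 5 * np.sin(2 * np.pi * t / 12)   # repeating seasonal pattern
noise = rng.normal(0, 1, t.size)            # irregular component
series = trend + seasonal + noise

print("lag-1 autocorrelation of the noise:", autocorr(noise, 1))
print("lag-1 autocorrelation of the full series:", autocorr(series, 1))
print("lag-12 autocorrelation of seasonal + noise:", autocorr(seasonal + noise, 12))
```

Pure noise should show little correlation between adjacent points, while the trend and seasonal components make neighbouring values (and values one full period apart) strongly related.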

By understanding the components of time series data, data analysts and scientists can more accurately forecast time series data. This allows businesses to make more informed decisions based on their data, which leads to better outcomes.

Types of Time Series Data

Time series data can broadly be categorized into two types: stationary and non-stationary.

Stationary time series data are those where the distribution of data points remains unchanged over time. This type of data shows a consistent mean and a consistent variance. Non-stationary time series data, on the other hand, have a trend, seasonality, or changing variance, meaning that their distribution shifts over time. This type of data is typically seen in financial markets, stock prices, and macroeconomic indicators.

It is important to note that forecasting requires knowledge of the type of data, as different forecasting techniques suit different types of time series. For example, simple exponential smoothing suits series without a trend or seasonality, while ARIMA can handle non-stationary series because its differencing step removes trends.

Stationarity

We now move on to stationarity in time series forecasting, which means that the statistical properties of the series remain constant over time. Stationarity is assessed by looking at the mean, variance, autocovariance, and autocorrelation of the data.

Non-Stationarity

Seasonal effects, trends, and other patterns that depend on the time index are visible in observations from a non-stationary time series. Summary statistics such as the mean and variance change over time, which causes a drift in the concepts a model might attempt to capture. By identifying and eliminating trends and seasonal effects, traditional time series analysis and forecasting approaches aim to make non-stationary time series data stationary.

It is important to check whether the time series data is stationary before attempting to forecast, as it can affect the accuracy of the predicted values. If the data is not stationary, it can be made stationary by accounting for the trend and seasonality. This can be done through transformations such as differencing, or by subtracting a smoothed estimate of the trend (for example, one obtained with exponential smoothing).

Once the time series data is stationary, it is much easier to forecast and make accurate predictions. In addition, traditional techniques such as ARIMA are more reliable when the series they are fitted to is stationary after differencing; Holt-Winters instead models trend and seasonality explicitly.
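Differencing can be sketched in a few lines. The example below is a minimal illustration on a synthetic series, assuming a linear trend and a seasonal period of 12; the series and the helper `half_means` are hypothetical and exist only to show the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
# Hypothetical non-stationary series: linear trend + period-12 seasonality + noise
series = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, t.size)

# First difference: y[t] - y[t-1], which removes the linear trend
first_diff = np.diff(series)

# Seasonal difference at lag 12: removes the repeating seasonal pattern
seasonal_diff = first_diff[12:] - first_diff[:-12]

def half_means(x):
    """Mean of the first and second half of x, as a crude drift check."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()

print("original half means:   ", half_means(series))
print("differenced half means:", half_means(seasonal_diff))
```

The two halves of the original series have very different means because of the trend, while the halves of the differenced series should be close to each other. Each round of differencing shortens the series, so the result has fewer observations than the input.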

Methods to check Stationarity

There are several ways to check whether a time series (raw observations, residuals, or otherwise) is stationary or non-stationary.

Examine Plots

You can review a time series plot of your data and visually check if there are any obvious trends or seasonality.

Summary Statistics

Reviewing summary statistics is a quick and dirty way to check whether your time series is non-stationary. Divide the series into two (or more) partitions and compare the mean and variance of each group. If they differ and the difference is statistically significant, the time series is probably non-stationary.

Be careful when the group means and standard deviations come out comparable but not identical: we might conclude that the time series is stationary based solely on these numbers, but this is a quick and dirty method that may be easily fooled.
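A minimal sketch of this partition check is shown below. The two synthetic series and the helper name `split_stats` are assumptions for illustration, and the comparison is purely descriptive: it does not test whether the differences are statistically significant.

```python
import numpy as np

def split_stats(x, parts=2):
    """Return (mean, variance) for each of `parts` consecutive chunks of x."""
    chunks = np.array_split(np.asarray(x, dtype=float), parts)
    return [(float(c.mean()), float(c.var())) for c in chunks]

rng = np.random.default_rng(1)
t = np.arange(400)
noise = rng.normal(0, 1, t.size)   # hypothetical stationary series
trended = 0.05 * t + noise         # same noise with an upward trend added

for name, data in [("noise", noise), ("trended", trended)]:
    for i, (mean, var) in enumerate(split_stats(data), start=1):
        print(f"{name} part {i}: mean={mean:.2f} variance={var:.2f}")
```

The partitions of the noise series should report similar means, while the trended series should show a clear jump in the mean from the first part to the second.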

Statistical Tests

A statistical test gives a more rigorous answer to whether a univariate time series is stationary or not. Three commonly used tests are:

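1) Augmented Dickey-Fuller (ADF) Test: The null hypothesis is that the series has a unit root (is non-stationary), so a small p-value is evidence of stationarity.

2) Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test: The null hypothesis is reversed: the series is stationary around a constant or a deterministic trend, so a small p-value is evidence of non-stationarity.

3) Phillips-Perron (PP) Test: Like the ADF test, the null hypothesis is a unit root, but serial correlation and heteroskedasticity in the errors are handled with a non-parametric correction rather than added lag terms.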

Methods for determining PDQ Values (optimal hyper parameters)

The pdq values in time series forecasting refer to the orders of the Autoregressive (AR, p), Integrated (I, d), and Moving Average (MA, q) components of the ARIMA model. They are important hyperparameters that determine the structure of the ARIMA model and can have a significant impact on forecast accuracy. Here are some ways to find the pdq values:

1) ACF and PACF plots

2) Information criteria (AIC and BIC values)

3) Grid search

Evaluating a forecasting model

Some of the commonly used error metrics for evaluating a time series forecast are:

Mean Absolute Error (MAE): This is a measure of the average absolute difference between the actual values and the forecasted values. A lower value of MAE indicates better accuracy of the forecast.

Mean Squared Error (MSE): This is a measure of the average squared difference between the actual values and the forecasted values. A lower value of MSE indicates better accuracy of the forecast.

Root Mean Squared Error (RMSE): This is the square root of the MSE. It provides a measure of the average magnitude of the forecast error. A lower value of RMSE indicates better accuracy of the forecast.

Mean Absolute Percentage Error (MAPE): This is a measure of the average percentage difference between the actual values and the forecasted values. A lower value of MAPE indicates better accuracy of the forecast.
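The four metrics above can be computed directly from their definitions. The following is a minimal sketch using NumPy; the function name `forecast_errors` and the sample numbers are made up for illustration.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Compute MAE, MSE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = actual - forecast
    mse = np.mean(err ** 2)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(mse),
        "RMSE": float(np.sqrt(mse)),
        # MAPE divides by the actual values, so it is undefined if any of them is zero
        "MAPE": float(np.mean(np.abs(err / actual)) * 100),
    }

# Hypothetical actual and forecasted values
metrics = forecast_errors([100, 200, 400], [110, 190, 380])
print(metrics)
```

MAE and RMSE are in the same units as the data, MSE is in squared units, and MAPE is a percentage, which makes it easier to compare across series of different scale.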
