Autoregressive Integrated Moving Average (ARIMA) Prediction Model

What Is an Autoregressive Integrated Moving Average (ARIMA)?

An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time series data to either better understand the data set or to predict future trends. 

A statistical model is autoregressive if it predicts future values based on past values. For example, an ARIMA model might seek to predict a stock's future prices based on its past performance or forecast a company's earnings based on past periods.

Key Takeaways

  • Autoregressive integrated moving average (ARIMA) models predict future values based on past values.
  • ARIMA combines autoregression, differencing, and a moving average of past forecast errors to model time series data.
  • They are widely used in technical analysis to forecast future security prices.
  • Autoregressive models implicitly assume that the future will resemble the past.
  • Therefore, they can prove inaccurate under certain market conditions, such as financial crises or periods of rapid technological change.

Understanding Autoregressive Integrated Moving Average (ARIMA)

An autoregressive integrated moving average model is a form of regression analysis in which a variable is regressed on its own past values and past forecast errors rather than on separate explanatory variables. In finance, the model's goal is to predict future securities or financial market moves by examining the differences between values in the series instead of the actual values.

An ARIMA model can be understood by outlining each of its components as follows:

  • Autoregression (AR): refers to a model that shows a changing variable that regresses on its own lagged, or prior, values.
  • Integrated (I): represents the differencing of raw observations to allow the time series to become stationary (i.e., data values are replaced by the difference between the data values and the previous values).
  • Moving average (MA): incorporates the dependency between an observation and a residual error from a moving average model applied to lagged observations.

ARIMA Parameters

Each component in ARIMA functions as a parameter with a standard notation. ARIMA models are usually written as ARIMA(p, d, q), where integer values substitute for the parameters to indicate the type of ARIMA model used. The parameters can be defined as:

  • p: the number of lag observations in the model, also known as the lag order.
  • d: the number of times the raw observations are differenced; also known as the degree of differencing.
  • q: the size of the moving average window, also known as the order of the moving average.
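
For reference, the three parameters can be tied together in one equation. The following is the usual textbook form rather than anything stated in this article; the symbols c (a constant), φ and θ (coefficients), ε (the error term), and B (the backshift operator, which maps a value to the previous one) are notation assumed here for illustration.

```latex
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t,
\qquad \text{where } y'_t = (1 - B)^d \, y_t
```

Here y'_t is the series after differencing d times, the first sum is the autoregressive part with p lags, and the second sum is the moving average part with q lagged errors.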

Just as a linear regression model is specified by the number and type of its terms, an ARIMA model is specified by these three values. A value of zero (0) for a parameter means that the corresponding component is not used in the model. In this way, an ARIMA model can be set up to perform the function of an ARMA model (when d is 0), or even a simple AR, I, or MA model.

Because ARIMA models are complicated to estimate by hand and tend to work best on long data histories, computer algorithms and statistical software are used to fit them.

ARIMA and Stationary Data

In an autoregressive integrated moving average model, the data are differenced in order to make them stationary. A stationary series is one whose statistical properties, such as its mean and variance, stay constant over time. Most economic and market data show trends, so the purpose of differencing is to remove any trends or seasonal structures.
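
As a rough illustration of differencing, the sketch below replaces each value with its change from the previous value, optionally more than once. The `difference` helper and the price figures are made up for this example.

```python
def difference(series, d=1):
    """Apply first differencing d times: each value minus the one before it."""
    values = list(series)
    for _ in range(d):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical prices with an accelerating upward trend.
prices = [100, 102, 105, 109, 114, 120]

first_diff = difference(prices, d=1)   # [2, 3, 4, 5, 6]  (still trending upward)
second_diff = difference(prices, d=2)  # [1, 1, 1, 1]     (constant, no trend left)

print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one observation, which is one reason to use the lowest order of differencing that removes the trend.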

Seasonality, meaning regular and predictable patterns in the data that repeat over a calendar year, could negatively affect the regression model. If a trend appears and stationarity is not evident, many of the computations in the process cannot be carried out in a way that produces the intended results.

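A one-time shock will affect subsequent values of an ARIMA model infinitely into the future, although in a stationary autoregressive process its influence shrinks with each passing period. In that sense, the legacy of the financial crisis lives on in today’s autoregressive models.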
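
To see how a shock carries forward, consider the simplest autoregressive case, an AR(1) process in which each value equals a coefficient times the previous value plus a new error. The sketch below is only an illustration: the `impulse_response` helper and the coefficient of 0.5 are assumptions, not values from any fitted model.

```python
def impulse_response(phi, horizon):
    """Effect of a one-unit shock at time 0 on an AR(1) series, k periods later.

    For y_t = phi * y_{t-1} + e_t, the effect after k periods is phi ** k.
    """
    effects = []
    value = 1.0
    for _ in range(horizon + 1):
        effects.append(value)
        value *= phi
    return effects


effects = impulse_response(phi=0.5, horizon=4)
print(effects)  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

With a coefficient between -1 and 1, the effect never reaches exactly zero but fades quickly. With a coefficient of 1, as in a random walk, the shock would persist at full strength.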

How to Build an ARIMA Model

To begin building an ARIMA model for an investment, download as much of the price data as you can. Once you've identified the trends in the data, identify the lowest order of differencing (d) by observing the autocorrelations. If the lag-1 autocorrelation is zero or negative, the series does not need further differencing. If the lag-1 autocorrelation is positive, and especially if the autocorrelations stay positive over many lags, you may need to difference the series more.
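
The lag-1 check can be sketched in a few lines. The code below uses the standard sample autocorrelation formula; the `autocorrelation` helper and the price series are made up for illustration, and the rule of thumb is a rough guide rather than a formal test.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical prices whose increases grow over time.
prices = [100, 102, 105, 109, 114, 120, 127, 135]
changes = [curr - prev for prev, curr in zip(prices, prices[1:])]

r1_prices = autocorrelation(prices, 1)    # about 0.62
r1_changes = autocorrelation(changes, 1)  # about 0.57

print(round(r1_prices, 2), round(r1_changes, 2))
```

In this made-up series the lag-1 autocorrelation is still clearly positive after one round of differencing, which by the rule of thumb hints that another round may be worth considering.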

Next, determine the order of autoregression (p) and the order of the moving average (q) by comparing autocorrelations and partial autocorrelations. Once you have the information you need, you can choose the model you'll use.
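
A common rule of thumb, not spelled out in this article, is that a pure AR(p) process has partial autocorrelations that drop to about zero after lag p, while a pure MA(q) process has autocorrelations that drop to about zero after lag q. The sketch below derives partial autocorrelations from a list of autocorrelations using the Durbin-Levinson recursion. The `pacf_from_acf` helper is hypothetical, and the input is the theoretical autocorrelation pattern of an AR(1) process with an assumed coefficient of 0.5.

```python
def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r[0..K] (with r[0] == 1).

    Uses the Durbin-Levinson recursion; returns a list of the same length.
    """
    pacf = [1.0]
    prev = []  # AR coefficients of the order (k - 1) fit
    for k in range(1, len(r)):
        numer = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        denom = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = numer / denom
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical autocorrelations of an AR(1) process with coefficient 0.5.
acf_values = [1.0, 0.5, 0.25, 0.125]
pacf_values = pacf_from_acf(acf_values)
print(pacf_values)  # [1.0, 0.5, 0.0, 0.0]
```

The autocorrelations decay gradually while the partial autocorrelations cut off after lag 1, which is the pattern that would point toward p = 1 and q = 0. With real data both sets of values are estimates, so the cutoffs are rarely this clean.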

Pros and Cons of ARIMA

ARIMA models have strong points and are good at forecasting based on past circumstances, but there are also reasons to be cautious when using them. In stark contrast to investing disclaimers that state "past performance is not an indicator of future performance...," ARIMA models assume that past values have some residual effect on current or future values and use data from the past to forecast future events.

The following lists summarize other ARIMA traits, both good and bad.

Pros
  • Good for short-term forecasting
  • Only needs historical data
  • Models non-stationary data

Cons
  • Not built for long-term forecasting
  • Poor at predicting turning points
  • Computationally expensive
  • Parameters are subjective

What Is ARIMA Used for?

ARIMA is a method for forecasting or predicting future outcomes based on a historical time series. It is based on the statistical concept of serial correlation, where past data points influence future data points.

What Are the Differences Between Autoregressive and Moving Average Models?

ARIMA combines autoregressive features with those of moving averages. An AR(1) autoregressive process, for instance, is one in which the current value is based on the immediately preceding value, while an AR(2) process is one in which the current value is based on the previous two values. The moving average part of an ARIMA model expresses the current value in terms of past forecast errors. This differs from the moving average used in charting, which is a calculation that creates a series of averages of different subsets of the full data set to smooth out the influence of outliers. As a result of this combination of techniques, ARIMA models can take into account trends, cycles, seasonality, and other non-static types of data when making forecasts.
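
The contrast can be made concrete with one-step predictions. In the sketch below, the autoregressive prediction is built from recent values of the series, while the moving average prediction (in the time-series-model sense) is built from recent forecast errors. The helper names, coefficients, and data are all made up for illustration.

```python
def ar_predict(history, coeffs, const=0.0):
    """AR prediction: constant plus weighted sum of the most recent values."""
    return const + sum(c * history[-1 - i] for i, c in enumerate(coeffs))


def ma_predict(errors, coeffs, mean=0.0):
    """MA prediction: series mean plus weighted sum of recent forecast errors."""
    return mean + sum(c * errors[-1 - i] for i, c in enumerate(coeffs))


history = [10.0, 12.0, 11.0]  # hypothetical past values, oldest first
errors = [0.5, -1.0]          # hypothetical past forecast errors, oldest first

ar1 = ar_predict(history, [0.5])             # 0.5 * 11.0             = 5.5
ar2 = ar_predict(history, [0.5, 0.25])       # 0.5 * 11.0 + 0.25 * 12 = 8.5
ma1 = ma_predict(errors, [0.5], mean=10.0)   # 10.0 + 0.5 * (-1.0)    = 9.5

print(ar1, ar2, ma1)
```

The AR(1) prediction uses only the last value, the AR(2) prediction uses the last two, and the MA(1) prediction adjusts the mean by a fraction of the last forecast error.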

How Does ARIMA Forecasting Work?

ARIMA forecasting is achieved by plugging in time series data for the variable of interest. Statistical software will identify the appropriate number of lags or amount of differencing to be applied to the data and check for stationarity. It will then output the results, which are often interpreted similarly to those of a multiple linear regression model.
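
To give a feel for what the software does, here is a deliberately simplified sketch of an ARIMA(1, 1, 0) forecast: difference the prices once, fit a single autoregressive coefficient to the changes by least squares, forecast the changes, and add them back onto the last price. Real statistical packages typically estimate the parameters by maximum likelihood and help choose the orders; the function names and price data here are made up.

```python
def fit_ar1(series):
    """Least-squares coefficient for x_t = phi * x_{t-1} + error (no constant)."""
    numer = sum(curr * prev for prev, curr in zip(series, series[1:]))
    denom = sum(prev * prev for prev in series[:-1])
    return numer / denom


def forecast_arima_110(prices, steps):
    """Forecast prices with a simplified ARIMA(1, 1, 0) model."""
    changes = [curr - prev for prev, curr in zip(prices, prices[1:])]  # d = 1
    phi = fit_ar1(changes)                                             # p = 1
    last_price, last_change = prices[-1], changes[-1]
    forecasts = []
    for _ in range(steps):
        last_change = phi * last_change       # forecast the next change
        last_price = last_price + last_change  # undo the differencing
        forecasts.append(last_price)
    return phi, forecasts


# Hypothetical prices whose gains halve each period.
prices = [100.0, 104.0, 106.0, 107.0, 107.5]
phi, forecasts = forecast_arima_110(prices, steps=3)

print(phi)        # 0.5
print(forecasts)  # [107.75, 107.875, 107.9375]
```

The forecasts flatten out as the predicted changes shrink, which reflects why models of this kind tend to be more informative over short horizons than long ones.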

The Bottom Line

The ARIMA model is used as a forecasting tool to predict how something will act in the future based on past performance. It is used in technical analysis to predict an asset's future performance.

ARIMA modeling is generally inadequate for long-term forecasting, such as more than six months ahead, because it relies on past data and on parameters that are chosen by human judgment. For this reason, it is best used with other technical analysis tools to get a clearer picture of an asset's performance.
