ARIMA Model Optimal Selection for Time Series Forecasting

TL;DR

This paper proposes a rapid, flexible method for optimal ARIMA model selection in univariate time series forecasting by identifying significant lags via autocorrelation of a detrended series. The approach minimizes both RMSE and MaxAE to achieve highly accurate forecasts, selecting the best model or combining models when a single minimum is not found.
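The lag-finding step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear trend instead of the best-fitting polynomial, and it assumes the common 1.96/sqrt(n) band as the significance threshold, which the summary does not specify. The function names `detrend_linear`, `autocorrelation`, and `significant_lags` are hypothetical.

```python
import math

def detrend_linear(y):
    """Subtract a least-squares straight line (assumed stand-in for the polynomial trend)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(y)]

def autocorrelation(x, max_lag):
    """Sample autocorrelation for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    return [sum(d[i] * d[i + k] for i in range(n - k)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(y, max_lag):
    """Lags whose autocorrelation falls outside the assumed +/-1.96/sqrt(n) band."""
    acf = autocorrelation(detrend_linear(y), max_lag)
    bound = 1.96 / math.sqrt(len(y))
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]

# Usage: a synthetic series with a linear trend and a period-12 cycle.
series = [0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(120)]
print(significant_lags(series, 24))
```

The lags returned this way would then serve as candidate orders for the ARIMA models being compared.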

Abstract

A fast and flexible method of optimal ARIMA model selection is suggested for univariate time series forecasting. The method automatically produces forecasts that are as accurate as possible. It is based on efficiently finding lags from the autocorrelation function of a detrended time series, that is, the series with its best-fitting polynomial trend subtracted. The forecasting quality criteria are the root-mean-square error (RMSE) and the maximum absolute error (MaxAE), which register information about the average inaccuracy and the worst outlier, respectively. The optimal ARIMA model is selected by simultaneously minimizing RMSE and MaxAE; if a single model attains both minima, it is the best model. Otherwise, if no such simultaneous minimum exists, a combination of the minimal-RMSE and minimal-MaxAE ARIMA models is used.
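The selection rule can be sketched as follows, under stated assumptions. The candidate forecasts are taken as given (no ARIMA fitting is shown), the function name `select_models` and the labels "A", "B", "C" are hypothetical, and because the abstract does not say how the two models are combined, the sketch only returns both labels.

```python
import math

def rmse(actual, forecast):
    """Root-mean-square error over the forecast points."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def max_ae(actual, forecast):
    """Maximum absolute error, i.e. the worst single forecast error."""
    return max(abs(a - f) for a, f in zip(actual, forecast))

def select_models(actual, forecasts):
    """forecasts maps a model label to its forecast values.

    Returns ([label], scores) when one model minimizes both criteria,
    otherwise ([min_rmse_label, min_maxae_label], scores).
    """
    scores = {name: (rmse(actual, f), max_ae(actual, f)) for name, f in forecasts.items()}
    best_rmse = min(scores, key=lambda name: scores[name][0])
    best_maxae = min(scores, key=lambda name: scores[name][1])
    if best_rmse == best_maxae:
        return [best_rmse], scores
    return [best_rmse, best_maxae], scores

# Usage with made-up test data and two made-up candidate forecasts.
actual = [10, 12, 11, 13]
candidates = {
    "A": [10, 12, 11, 15],          # accurate except for one outlier
    "B": [11.2, 10.8, 12.2, 11.8],  # uniformly off, no large outlier
}
chosen, scores = select_models(actual, candidates)
print(chosen, scores)
```

Here model A has the lower RMSE and model B the lower MaxAE, so no single minimum exists and both labels are returned for combination.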

Similar Papers
  • Research Article
  • 10.30837/bi.2021.1(96).01
Maximum-versus-mean absolute error in selecting criteria of time series forecasting quality
  • Jul 2, 2021
  • Bionics of Intelligence
  • Vadim Romanuke

In time series forecasting, a commonly accepted criterion of forecasting quality is the root-mean-square error (RMSE). Sometimes only RMSE is used. In other cases, the mean absolute error (MAE) is used along with RMSE. Although RMSE and MAE are the common criteria of time series forecasting quality, they both register information about averaged errors. However, averaging may remove information about volatility, which is typical for time series, at a few points (outliers) or in narrow intervals. Information about outliers in time series forecasts (with respect to test data) can be registered by the maximum absolute error (MaxAE). The MaxAE criterion involves no averaging; it registers information about the worst outlier instead. Therefore, the goal is to ascertain the best criteria of time series forecasting quality, wherein the RMSE criterion is always present. First, 12 types of benchmark time series are defined to test and select criteria. Each time series has 168 points, of which the last third is forecasted. After generating 200 time series for each of those 12 types, ARIMA forecasts are made at 56 points of every series. All 2400 RMSEs are sorted in ascending order, whereupon the respective MAEs and MaxAEs are re-arranged accordingly. The interrelation between the RMSE and MAE/MaxAE is studied by their intercorrelation function. RMSEs and MaxAEs are “more different” than RMSEs and MAEs, because the correlation between the RMSE and MAE is stronger. Consequently, the MAE criterion is useless, as it nearly replicates the information about forecasting quality already given by the RMSE criterion. Inasmuch as the MaxAE criterion contributes additional information about the forecasting quality, the best criteria are RMSE and MaxAE.
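For reference, the three criteria compared above have the following standard definitions. The notation is an assumption, not taken from the source: y_t are the test values, \hat{y}_t the forecasts, and n the number of forecast points.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|, \qquad
\mathrm{MaxAE} = \max_{1 \le t \le n}\left|y_t - \hat{y}_t\right|
```

As a small illustration, the error vectors (0, 0, 0, 2) and (1, 1, 1, 1) both give RMSE = 1, but their MaxAE values are 2 and 1, so only MaxAE reveals the outlier in the first case.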

  • Research Article
  • Cited by 8
  • 10.9734/ajaees/2019/v30i230106
Time Series Analysis and Forecasting of Oilseeds Production in India: Using Autoregressive Integrated Moving Average and Group Method of Data Handling – Neural Network
  • Feb 27, 2019
  • Asian Journal of Agricultural Extension, Economics & Sociology
  • Debasis Mithiya + 2 more

Oilseeds have long been the backbone of India’s agricultural economy. Oilseed crops play the second most important role in the Indian agricultural economy, next to food grains, in terms of area and production. Oilseeds production in India has increased with time; however, the increasing demand for edible oils has necessitated imports in large quantities, leading to a substantial drain of foreign exchange. The need to address this deficit motivated a systematic study of the oilseeds economy to formulate appropriate strategies to bridge the demand-supply gap. In this study, an effort is made to forecast oilseeds production by using the Autoregressive Integrated Moving Average (ARIMA) model, which is the most widely used model for forecasting time series. One of the main drawbacks of this model is the presumption of linearity. The Group Method of Data Handling (GMDH) model has also been applied for forecasting oilseeds production because the production series contains nonlinear patterns. Both ARIMA and GMDH are mathematical models well known for time series forecasting. The results obtained by the GMDH are compared with the results of the ARIMA model. The comparison of modeling results shows that the GMDH model performs better than the ARIMA model in terms of mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The experimental results of both models indicate that the GMDH model is a powerful tool for handling time series data and that it provides a promising technique among time series forecasting methods.

  • Research Article
  • Cited by 85
  • 10.1016/j.procs.2018.10.526
Roll motion prediction using a hybrid deep learning and ARIMA model
  • Jan 1, 2018
  • Procedia Computer Science
  • Novri Suhermi + 3 more

  • Research Article
  • Cited by 2
  • 10.29244/ijsa.v9i1p145-156
Forecasting Nonlinear Time Series with ARIMA, ANN, and Hybrid Models: A Case Study on Inflation Rate in Sri Lanka
  • Jun 24, 2025
  • Indonesian Journal of Statistics and Its Applications
  • W M Sudarshana Bandara + 1 more

In time series forecasting, hybrid models combining autoregressive integrated moving average (ARIMA) and artificial neural networks (ANNs) have gained prominence due to their ability to capture both linear and nonlinear patterns within data. ARIMA models are effective at modeling linear relationships, while ANNs are adept at handling complex nonlinear structures. However, each model has its limitations when used independently. This study presents a hybrid model that integrates the strengths of both ARIMA and ANN to forecast the monthly inflation rate in Sri Lanka using historical data from 1988 to 2018. Our findings demonstrate that the proposed hybrid model outperforms the standalone ARIMA and ANN models, particularly in terms of Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). By leveraging the complementary strengths of ARIMA and ANN, this hybrid approach provides a robust forecasting framework for handling the diverse structural complexities of time series data.

  • Research Article
  • Cited by 10
  • 10.5897/ijwree2017.0740
English
  • Sep 30, 2017
  • International Journal of Water Resources and Environmental Engineering
  • Makwinja Rodgers + 3 more

Stochastic models have proven to be practically fundamental in fields such as science, economics, and business, among others. In Malawi, stochastic models have been used in fisheries to forecast fish catches. Nevertheless, forecasting water levels in major lakes and rivers in Malawi has been given little attention despite the availability of ample historical data. Although previous multichannel seismic surveys revealed the presence of low stands (sediment bypass zone) in Lake Malawi indicating that since the beginning of its formation, important water level fluctuations have been occurring, these previous surveys failed to predict and highlight much more clearly the status of these levels in the future. Therefore, the main objective of the study was to fill these research gaps. The study used Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA) processes to select the appropriate stochastic model. Based on lowest Normalized Bayesian Information Criterion (NBIC), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Forecast Error (MFE), Maximum Absolute Percentage Error (MAXAPE), Maximum Absolute Error (MAXAE), and Mean Absolute Error (MAE) - ARIMA (0,1,1) model is found suitable for forecasting Lake Malawi water levels which shows negative trend up to 2035. The study further predicted that Lake Malawi water levels will decrease from the current average level of 472.97 m to an average of 468.63 m for the next 18 years (up to 2035). Key words: Forecasting, Lake Malawi, modelling, stochastic, time series, water levels.

  • Research Article
  • Cited by 6
  • 10.7176/jesd/10-23-02
Forecasting GDP Growth Rates of Bangladesh: An Empirical Study
  • Dec 1, 2019
  • Journal of Economics and Sustainable Development
  • Liton Chandra Voumik + 3 more

The Gross Domestic Product (GDP) is the market value of all goods and services produced within the boundary of a nation in a year. This paper aims to apply time series tools and forecast GDP growth in the Bangladesh economy. Forecasting of time series is an important topic in macroeconomics. We collected the data from the World Development Indicators (WDI) of the World Bank, covering a period of 37 years. Augmented Dickey–Fuller (ADF) and Phillips–Perron (PP) tests were applied to investigate the stationarity of the data. Stata and R statistical software were used to build a class of Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing models of GDP growth. We applied several ARIMA (P, I, Q) models and selected the ARIMA (1,1,1) model as the best for forecasting. This ARIMA (1,1,1) model was chosen based on the minimum values of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). We also applied Exponential Smoothing to forecast the GDP growth rate. Among the Exponential Smoothing models, the triple exponential model fitted the data best based on the lowest Sum of Square Error (SSE) and Root Mean Square Error (RMSE). Using these models, the values of future GDP growth rates are forecasted. Statistical results show that Bangladesh’s GDP growth rate follows an increasing trend that will continue rising in the future. This finding will help policymakers and academicians to formulate economic and business strategies more precisely. Keywords: Stationary time series, ARIMA, Time Series Forecasting, Exponential Smoothing, GDP growth rate, GDP growth in Bangladesh. DOI: 10.7176/JESD/10-23-02. Publication date: December 31st, 2019.

  • Research Article
  • 10.22131/sepehr.2019.35646
The Comparison of ARIMA and Neural Network methods for Modeling and Monitoring of Drought Using Remote Sensing Time Series Data (Case Study: City of Arak)
  • May 22, 2019
  • Mohammad Mahdi Khoshgoftar + 2 more

Introduction: Drought is a critical climate condition affecting many places on Earth. Drought severity is often measured using a combination of different variables including rainfall, temperature, humidity, wind, soil moisture, and stream flow. During the last decades, Iran has suffered from drought conditions and it may suffer more in the future. The frequent occurrence of drought in Iran is mainly due to a lack of sufficient precipitation and an improper water management system. Drought is often categorized into three types: meteorological, agricultural, and hydrological. There are various methods for measuring and quantifying drought severity. The most commonly used ones are the Palmer Drought Severity Index (PDSI) and the Standardized Precipitation Index (SPI). Remotely sensed data can also be used for monitoring drought conditions. The most widely used indices are the Normalized Difference Vegetation Index (NDVI), Land Surface Temperature (LST), Vegetation Condition Index (VCI), Temperature Vegetation Index (TVX) and NDVI deviation Index (DEV). Neural Network (NN) and Autoregressive Integrated Moving Average (ARIMA) are two of the most widely applied methods for modeling and monitoring drought severity indices. In this paper, monthly time series data (2000 to 2014) of three remotely sensed indices (i.e., NDVI, VCI, and TVX) and one meteorological index (i.e., SPI) were applied for modeling drought severity. In addition, the NN and ARIMA were developed for modeling these indices. Materials & Methods: Data used in this paper were the time series of NDVI, VCI, TVX, and SPI. The study area was Arak, the center of Markazi province. It has cold and wet winters with warm and dry summers. ARIMA and NN were employed for modeling the indices. The ARIMA model is generally derived from three basic time series models: Autoregressive (AR), Moving Average (MA), and Autoregressive Moving Average (ARMA). These basic models are used with stationary time series, i.e., series whose mean and covariance are constant over time. Usually, the NN method has three layers. The first layer, or input layer, introduces data to the network. Input data is processed in the second layer, or hidden layer. Finally, the output layer produces the results for the input data. In this paper, a single hidden layer feed-forward network, which is the most widely utilized NN form, was employed for modeling the indices. Results & Discussion: After implementing the NN and ARIMA models on the time series data, the performance of the models was evaluated using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The RMSEs obtained by NN for modeling the NDVI, VCI, TVX, and SPI indices of Arak were 0.1944, 0.2191, 0.1295, and 0.2990, respectively. In addition, the RMSEs obtained by ARIMA for modeling these indices were 0.0770, 37.2318, 0.2658, and 1.3370. In another experiment, the correlation between the remotely sensed indices and SPI was studied. Among the remotely sensed indices, TVX shows the strongest correlation with SPI. Conclusion: In the present study, drought conditions in the central region of Markazi province were studied during the 2000 to 2014 period. We used time series of remotely sensed data (such as LST and NDVI) and meteorological data (such as SPI). Then the TVX, VCI, and DEV indices were extracted from the NDVI and LST data. NN and ARIMA were applied for modeling the time series data. Based on the findings, it is concluded that NN is more successful and efficient than ARIMA for this study area. In addition, TVX, which is built from NDVI and LST, had the strongest correlation with SPI. This implies that both the vegetation index and the temperature index play an important role in modeling and monitoring drought conditions.

  • Research Article
  • 10.37591/rrjost.v7i3.1688
The Comparison in Time Series Forecasting of Air Traffic Data by Autoregressive Integrated Moving Average Model, Radial Basis Function and Elman Recurrent Neural Networks
  • Feb 13, 2019
  • R S Ramakrishna + 2 more

Nowadays, nonlinear time series and artificial neural network (ANN) models are used for forecasting in fields such as business, agriculture, and so on. Recent studies have shown that ANNs have been successfully used for forecasting financial and agricultural data series. The classical methods used for time series prediction, such as Box-Jenkins or ARIMA, assume that there is a linear relationship between inputs and outputs. ANNs have the advantage that they can approximately model both linear and nonlinear structures in time series, but they are not able to handle both structures equally well. The autoregressive integrated moving average (ARIMA) model and two ANN models, namely radial basis function neural networks (RBFNN) and Elman recurrent neural networks (ERNN), were applied to Hyderabad airport traffic data. The data cover 15 years, from 2002–2003 to 2016–2017, of domestic and international passengers at the International Airport of Hyderabad, India. In this research paper, we compared the performances of ARIMA, RBFNN and ERNN based on three measures: mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The results showed that RBFNN obtained the smallest MAE, MAPE and RMSE in both the modeling and forecasting processes. The performances of the three models ranked in ascending order were: ARIMA, ERNN and the RBFNN model. Keywords: Time series, forecasting, artificial neural networks, ARIMA models, radial basis function neural networks, and Elman recurrent neural networks. Cite this Article: R. Ramakrishna, Berhe Aregay, Tewodros Gebregergs. The Comparison in Time Series Forecasting of Air Traffic Data by Autoregressive Integrated Moving Average Model, Radial Basis Function and Elman Recurrent Neural Networks. Research & Reviews: Journal of Statistics. 2018; 7(3): 75–90p.

  • Research Article
  • Cited by 259
  • 10.1016/j.knosys.2010.07.006
Forecasting time series using a methodology based on autoregressive integrated moving average and genetic programming
  • Jul 17, 2010
  • Knowledge-Based Systems
  • Yi-Shian Lee + 1 more

  • Research Article
  • Cited by 273
  • 10.1016/s0261-5177(01)00098-x
Time series forecasts of international travel demand for Australia
  • Nov 7, 2001
  • Tourism Management
  • Christine Lim + 1 more

  • Conference Article
  • Cited by 5
  • 10.1109/iccse.2012.6295015
Emotional prediction using time series multiple-regression genetic algorithm for autistic syndrome disorder
  • Jul 1, 2012
  • Teik-Toe Teoh + 2 more

Time series forecasting is an active research area that has drawn considerable attention for applications in a variety of areas. Auto-Regressive Integrated Moving Average (ARIMA) models are one of the most important time series models used in financial market forecasting over the past three decades. Recent research activities in time series forecasting indicate that two basic limitations detract from their popularity for financial time series forecasting: (a) ARIMA models assume that future values of a time series have a linear relationship with current and past values as well as with white noise, so approximations by ARIMA models may not be adequate for complex nonlinear problems; and (b) ARIMA models require a large amount of historical data in order to produce accurate results. Both theoretical and empirical findings have suggested that integration of different models can be an effective method of improving upon their predictive performance, especially when the models in the ensemble are quite different. In this paper, ARIMA models are integrated with Artificial Neural Networks (ANNs) and Fuzzy logic in order to overcome the linear and data limitations of ARIMA models, thus obtaining more accurate results. Empirical results of forecasting model indicate that the hybrid models exhibit effectively improved forecasting accuracy so that the model proposed can be used as an alternative to financial market forecasting tools. In this paper, experiments were conducted to confirm these hypotheses by evaluating the predictive capability of the developed ensemble of models in the domain of emotion prediction. This work attempts to anticipate subsequent emotion given historical emotions recorded.

  • Research Article
  • Cited by 8
  • 10.4156/jcit.vol8.issue4.8
Application of a Hybrid ARIMA and Neural Network Model to Water Quality Time Series Forecasting
  • Feb 28, 2013
  • Journal of Convergence Information Technology
  • Han Yan + 1 more

In this paper the water quality forecasting at the Nanjinguan water quality monitoring station of Yangtze River, China, is presented. The time series data used are weekly water quality data obtained directly from Nanjinguan station measurements over the course of five years. In order to forecast water quality, hybrid models consisting of Autoregressive Integrated Moving Average (ARIMA) models and Artificial Neural Network (ANNs) models were developed. The ARIMA models were first used to do water quality forecasting of the time series data and then with the obtained errors ANNs were built taking into account the nonlinear patterns that the ARIMA technique could not capture, in order to reduce potential errors. Once the hybrid models were developed 38 samples out of the data for the station were used to do the water quality forecasting and the results were compared with the ARIMA and the ANNs models worked separately. Statistical error measures such as the root mean square error (RMSE), the mean absolute percentage error (MAPE) and correlation coefficient (R) were calculated to compare the three methods. The results showed that the hybrid models predict the water quality with a higher accuracy than the ARIMA and ANNs models in the examined station.

  • Book Chapter
  • Cited by 9
  • 10.1007/978-981-15-4032-5_8
Annual Rainfall Prediction Using Time Series Forecasting
  • Jan 1, 2020
  • Asmita Mahajan + 2 more

This paper attempts to determine which of the various univariate forecasting techniques produces accurate and statistically compelling forecasts for rainfall. The term “univariate time series” refers to a time series that consists of a sequence of measurements of the same variable collected over regular time intervals. Techniques for forecasting rainfall are important because they are useful for business purposes, they help account for the transportation hazards that result from heavy rainfall, and they help farmers and gardeners plan for crop irrigation and protection. Most commonly, the techniques for prediction are regression analysis, clustering, autoregressive integrated moving average (ARIMA), error, trend, seasonality (ETS) and artificial neural network (ANN). In this paper, a review is provided of different techniques for predicting rainfall as early as possible. The paper compares the performance of forecasting techniques such as ARIMA, ANN, and ETS based on accuracy measures such as mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE) and mean absolute error (MAE). On comparing these techniques, it is evident that ARIMA performs well on the given data.

  • Research Article
  • Cited by 2
  • 10.32508/stdjelm.v3i1.540
Forecasting stock index based on hybrid artificial neural network models
  • Jun 5, 2019
  • Science & Technology Development Journal - Economics - Law and Management
  • Ta Quoc Bao + 3 more

Forecasting a stock index is a crucial financial problem that has recently received a lot of interest in the field of artificial intelligence. In this paper we study some hybrid artificial neural network models. As the main result, we show that hybrid models offer effective tools to forecast a stock index accurately. Within this study, we have analyzed the performance of classical models such as the Autoregressive Integrated Moving Average (ARIMA) model, the Artificial Neural Network (ANN) model and the Hybrid model, using real data from the Vietnam Index (VNINDEX). Based on some previous foreign data sets, for most complex time series the novel hybrid models perform well compared to individual models like ARIMA and ANN. Regarding the Vietnamese stock market, our results also show that the Hybrid model gives much better forecasting accuracy than the ARIMA and ANN models. Specifically, our results show that the Hybrid combination model delivers smaller Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) than the ARIMA and ANN models. The fitting curves demonstrate that the Hybrid model produces a closer trend and so better describes the actual data. Our study with the Vietnam Index confirms that the characteristics of the ARIMA model are more suitable for linear time series, while the ANN model works well with nonlinear time series. The Hybrid model takes into account both of these features, so it can be employed for more general time series. As the financial market is increasingly complex, the time series corresponding to stock indexes naturally consist of linear and nonlinear components. Because of these characteristics, the Hybrid ARIMA model with ANN produces better prediction and estimation than other traditional models.

  • Research Article
  • Cited by 10
  • 10.1016/j.heliyon.2023.e13782
Using weather factors and google data to predict COVID-19 transmission in Melbourne, Australia: A time-series predictive model
  • Feb 21, 2023
  • Heliyon
  • Hannah Mcclymont + 2 more
