Impact of Starting Outlier Removal on Accuracy of Time Series Forecasting

Vadim Romanuke

doi:10.2478/sjpna-2022-0001

Abstract

The presence of an outlier at the starting point of a univariate time series negatively influences the forecasting accuracy. The starting outlier is effectively removed only by making it equal to the second time point value. The forecasting accuracy is significantly improved after the removal. The favorable impact of the starting outlier removal on the time series forecasting accuracy is strong. It is least favorable for exponentially rising time series. In the worst case of a time series, on average only 7 % to 11 % of forecasts after the starting outlier removal are worse than they would be without the removal.

Highlights

In time series analysis and forecasting, data preparation and preprocessing is a very important phase before obtaining factual forecasts
The goal is to determine the impact of the starting outlier removal on the time series forecasting accuracy
A time series can only be smoothed, with the purpose of eliminating high-frequency fluctuations that are most probably consequences of true randomness [8, 15, 17, ...]

Summary

INTRODUCTION

In time series analysis and forecasting, data preparation and preprocessing is a very important phase before obtaining factual forecasts. If the starting point in the time series is an outlier, the result of smoothing may be unsatisfactory. How much does this negatively influence the forecasting accuracy?

The goal is to determine the impact of the starting outlier removal on the time series forecasting accuracy. One of the tasks set to reach this goal is to define a set of benchmark time series for testing the forecasting accuracy before and after the starting outlier removal. A time series can only be smoothed, with the purpose of eliminating high-frequency fluctuations that are most probably consequences of true randomness [8, 15, 17, ...].

Graphical examples of the three worst-forecasted benchmark time series per pattern are presented in fig. The worst and best forecasts prior to the removal (squares) and after it (circles) are shown in fig.
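The removal rule stated in the abstract (make the starting value equal to the second time point value) can be illustrated with a minimal Python sketch. Everything beyond that rule is an assumption made for illustration: the function names `remove_starting_outlier` and `moving_average`, the toy data, and the choice of a 3-point moving average as the smoother are hypothetical and are not taken from the paper. The sketch also does not decide whether the first point is an outlier; it applies the replacement unconditionally.

```python
import numpy as np


def remove_starting_outlier(series):
    """Return a copy of the series whose first value equals the second value.

    Hypothetical helper: it does not test whether the first point really is
    an outlier; the caller is assumed to have decided that already.
    """
    y = np.asarray(series, dtype=float).copy()
    if y.size >= 2:
        y[0] = y[1]
    return y


def moving_average(series, window=3):
    """Plain moving average over full windows only (an assumed smoother)."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")


# Toy series (made up for illustration): a slow rise with an outlying start.
raw = np.array([50.0, 10.0, 11.0, 12.0, 13.0, 14.0])
cleaned = remove_starting_outlier(raw)

smoothed_raw = moving_average(raw)
smoothed_cleaned = moving_average(cleaned)

# Only the first smoothed value involves the starting point,
# so it is the only one that changes after the removal.
print(smoothed_raw[0], smoothed_cleaned[0])
```

In this toy case the outlier distorts only the first smoothing window, which is consistent with the remark that smoothing may be unsatisfactory when the starting point is an outlier. How the distortion propagates into actual forecasts depends on the forecasting model, which this sketch does not cover.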

Findings

DISCUSSION

CONCLUSION