A New Method to Detect Outliers in High-frequency Time Series

Ilaria Lucrezia Amerise, Agostino Tarsitano

doi:10.5539/ijsp.v8n1p16

Abstract

The objective of this research is to develop a fast, simple method for detecting and replacing extreme spikes in high-frequency time series data. The method primarily consists of a nonparametric procedure that pursues a balance between fidelity to observed data and smoothness. Furthermore, through examination of the absolute difference between original and smoothed values, the technique is also able to detect and, where necessary, replace outliers with less extreme data. Unlike other filtering procedures found in the literature, our method does not require a model to be specified for the data. Additionally, the filter makes only a single pass through the time series. Experiments show that the new method can be validly used as a data preparation tool to ensure that time series modeling is supported by clean data, particularly in a complex context such as one with high-frequency data.

Highlights

An important topic in time series analysis is how to deal with data that consist of on-the-minute, hourly, daily or weekly observations.
The paper is organized as follows: we present the normalized linear filter (NLF) together with computation of the thresholds beyond which outliers are detected.
Brooks et al. (1988) applied the generalized cross-validation (GCV) score suggested in Golub & Wahba (1979).

Summary

Introduction

An important topic in time series analysis is how to deal with data that consist of on-the-minute, hourly, daily or weekly observations. Let p_t ≥ 0 be the observed value at period t, and let n be the length of the time series p_t, t = 1, 2, ..., n. We detect extreme spikes (or outliers) by examining the absolute difference between each observed value and the corresponding point on the reference curve. The function has two terms: goodness of fit and smoothness. F(p) measures fidelity to the data in terms of the squared deviations between smoothed and observed values. F_m is the maximum of F(p), which occurs when all m-th differences are equal to zero; in this case, the reference curve is determined by fitting a polynomial of degree (m−1) to p by least squares. A very simple choice is λ = 0.5, which implies that fidelity and smoothness are balanced. The final section discusses our findings and points out some improvements for further applications.
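The trade-off between fidelity and smoothness, followed by spike detection on absolute differences, can be illustrated with a minimal Python sketch. It is not the authors' normalized linear filter: it assumes a plain (unnormalized) penalized least-squares smoother that minimizes λ·Σ(p_t − s_t)² + (1 − λ)·Σ(m-th differences of s)², and it assumes a median/MAD rule for the threshold, since the summary does not state how the paper computes its thresholds. The function names `smooth` and `flag_spikes` and the multiplier `k` are hypothetical.

```python
import numpy as np

def smooth(p, lam=0.5, m=2):
    """Reference curve from a penalized least-squares smoother (assumed form).

    Minimizes lam * ||p - s||^2 + (1 - lam) * ||D s||^2, where D takes
    m-th differences. Requires 0 < lam <= 1 so the system is nonsingular.
    """
    p = np.asarray(p, dtype=float)
    n = p.size
    # (n - m) x n matrix of m-th differences, built from the identity.
    D = np.diff(np.eye(n), n=m, axis=0)
    # Normal equations of the penalized criterion.
    A = lam * np.eye(n) + (1.0 - lam) * (D.T @ D)
    return np.linalg.solve(A, lam * p)

def flag_spikes(p, lam=0.5, m=2, k=5.0):
    """Flag points whose absolute deviation from the reference curve is large.

    The median + k * scaled-MAD threshold is an assumption made for this
    sketch, not the thresholding rule of the paper.
    """
    p = np.asarray(p, dtype=float)
    s = smooth(p, lam=lam, m=m)
    r = np.abs(p - s)                      # absolute differences
    med = np.median(r)
    mad = np.median(np.abs(r - med))
    threshold = med + k * 1.4826 * mad
    mask = r > threshold
    # Replace flagged values with the less extreme smoothed values.
    cleaned = np.where(mask, s, p)
    return mask, cleaned, s

# Usage: a smooth series with one injected spike at index 100.
t = np.arange(200)
series = 10.0 + np.sin(t / 10.0)
series[100] += 50.0
mask, cleaned, reference = flag_spikes(series)
print(bool(mask[100]), cleaned[100] < series[100])
```

In this sketch, a spike also pulls the reference curve toward itself, so neighbouring points may be flagged as well. As λ approaches 0 the smoothness term dominates and the curve tends toward a least-squares polynomial of degree m − 1, which matches the limiting case described above.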

Optimal Smoothing

Choice of the Smoothing Constant

Detection of Extreme Spikes

Segmentation

Monte Carlo Analysis

SARIMA Models

Effects of Smoothing on Point Forecast Accuracy

Simulation Design

Findings

Conclusions and Future Research

Journal: International Journal of Statistics and Probability
Publication Date: Nov 19, 2018
Citations: 2
License type: CC BY 4.0
