Abstract

Detection of anomalies in streaming time series data has become an important research topic because of its wide range of applications, including the detection of extreme weather conditions, malicious attacks on protected facilities, gas and oil leaks, illegal pipeline connections, power cable faults, and water and other environmental pollution. Rapid detection of abnormal conditions and identification of these critical events is essential to protect lives and assets, so developing appropriate systems and detection methods is an urgent task. Anomaly detection in streaming data is challenging because of the large volume and high speed of the data, the presence of noise, and the non-stationarity of the signal (‘concept drift’). Non-stationarity in particular makes it hard to distinguish a new ‘typical’ behaviour from an abnormal event. Solving this problem requires an algorithm that learns and adapts to changing conditions. This paper proposes a modification of an algorithm for detecting anomalies in time series. The algorithm provides early detection of abnormal series in a massive collection of non-stationary streaming time series. Anomalies are defined as observations that are highly improbable given the previous values of the time series. The proposed approach first determines a prediction boundary for typical system behaviour using extreme value theory, and then checks subsequent series for abnormality using a sliding window. The time series parameters are used as input data, and their density distributions are compared to detect any significant change in the distribution of the characteristics. This allows the decision-making model to adapt automatically to the changing environment in response to the detected changes.
Since anomalies are, by definition, exceptions to the typical behaviour of a system, most of the available stored data should reflect the typical behaviour of the system in question. The algorithm does not need representative samples of every possible type of typical behaviour to work well. The basic idea is to use a warm-up data set to obtain initial values for the parameters of the decision model. This makes it possible to determine whether there is a significant difference between the previous typical behaviour and the new typical behaviour. The proposed algorithm is shown to perform well on noisy, non-stationary data from several classes of time series.

Keywords: concept drift, extreme value theory, multivariate time series, outlier detection.
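
The general idea of a warm-up set, an extreme-value boundary, and a sliding window can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it assumes a single univariate stream, looks only at the upper tail, fits a generalised Pareto distribution to threshold excesses by the method of moments, and replaces the density-based drift check with a simple periodic refit on recent values judged typical. The names `pot_threshold` and `detect`, and the parameters `init_quantile`, `risk`, `window` and `refit_every`, are hypothetical and chosen for illustration only.

```python
import math
from collections import deque
from statistics import mean, pvariance


def pot_threshold(values, init_quantile=0.95, risk=1e-3):
    """Peaks-over-threshold estimate of a level exceeded with probability `risk`.

    Illustrative assumption: the excesses over a high empirical quantile follow
    a generalised Pareto distribution, fitted here by the method of moments.
    """
    data = sorted(values)
    n = len(data)
    t = data[int(init_quantile * (n - 1))]      # initial high threshold
    excesses = [x - t for x in data if x > t]
    if len(excesses) < 2:
        return t                                # too few peaks to fit a tail
    m, v = mean(excesses), pvariance(excesses)
    if v == 0:
        return t + m                            # degenerate tail, no spread
    ratio = m * m / v
    xi = 0.5 * (1.0 - ratio)                    # shape estimate
    sigma = 0.5 * m * (1.0 + ratio)             # scale estimate
    r = risk * n / len(excesses)
    if abs(xi) < 1e-9:
        return t - sigma * math.log(r)          # exponential-tail limit
    return t + (sigma / xi) * (r ** (-xi) - 1.0)


def detect(stream, warmup, window=200, refit_every=50, risk=1e-3):
    """Flag values above an adaptive boundary learned from recent typical data."""
    recent = deque(warmup, maxlen=window)       # sliding window of typical values
    threshold = pot_threshold(recent, risk=risk)
    flags, since_refit = [], 0
    for x in stream:
        is_anomaly = x > threshold
        flags.append(is_anomaly)
        if not is_anomaly:                      # only typical values update the model
            recent.append(x)
            since_refit += 1
            if since_refit >= refit_every:      # periodic refit tracks slow drift
                threshold = pot_threshold(recent, risk=risk)
                since_refit = 0
    return flags
```

Calling `detect(stream, warmup)` returns one Boolean flag per streamed value. Excluding flagged values from the window keeps isolated spikes from inflating the boundary, at the cost of adapting more slowly to an abrupt but legitimate change in behaviour.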
