Abstract

Outliers are unexpected observations that deviate from the majority of observations. Outlier detection and prediction are challenging tasks because outliers are rare by definition. A stream is an unbounded source of data that has to be processed promptly. This article proposes novel methods for outlier detection and outlier prediction in streams of sensor data. The outlier detection is an independent, unsupervised process implemented with an autoencoder. It continuously evaluates whether the latest data point $\mathbf{x}_i$ from a stream is an inlier or an outlier. This distinction is based on the reconstruction cost combined with Chebyshev’s inequality and the EWMA (exponentially weighted moving average) model. The outlier prediction uses the results of the outlier detection to form the required training data. It utilizes LR (logistic regression), SGD (stochastic gradient descent) and the hidden representation provided by the autoencoder to predict outliers in streams. The results of the experiments show that the proposed methods (1) provide accurate results, (2) require reduced computation time and (3) use a low amount of memory. The proposed methods are suitable for analyzing streams of sensor data and providing results with low latency. The experiments also indicate that the outlier prediction is able to anticipate the occurrence of outliers in streams of sensor data.
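
The abstract names the ingredients of the detection rule (reconstruction cost, Chebyshev’s inequality, EWMA) but not the exact decision rule. The following is a minimal sketch of one way these pieces could fit together, not the authors' implementation. It assumes the autoencoder already yields a scalar reconstruction cost per data point. The class name `EwmaChebyshevDetector`, the parameters `alpha`, `k` and `warmup`, the one-sided test, and the choice to update the statistics only with inliers are all illustrative assumptions.

```python
import math


class EwmaChebyshevDetector:
    """Flags a reconstruction cost as an outlier when it exceeds the
    EWMA mean by more than k EWMA standard deviations.

    Hypothetical sketch: alpha, k and warmup are assumed parameters.
    """

    def __init__(self, alpha=0.1, k=4.0, warmup=10):
        self.alpha = alpha    # EWMA smoothing factor in (0, 1]
        self.k = k            # number of standard deviations tolerated
        self.warmup = warmup  # points consumed before any flagging
        self.mean = 0.0
        self.var = 0.0
        self.n = 0

    def bound(self):
        # Chebyshev: P(|X - mean| >= k * std) <= 1 / k^2 for any
        # distribution with finite variance.
        return 1.0 / (self.k ** 2)

    def update(self, cost):
        """Classify one reconstruction cost, then update the statistics."""
        is_outlier = False
        if self.n >= self.warmup:
            std = math.sqrt(self.var)
            is_outlier = (cost - self.mean) > self.k * std
        if self.n == 0:
            self.mean = cost
        elif not is_outlier:
            # Incremental exponentially weighted mean and variance.
            # Assumption: outliers are excluded from the update so that
            # they do not inflate the threshold.
            diff = cost - self.mean
            incr = self.alpha * diff
            self.mean += incr
            self.var = (1.0 - self.alpha) * (self.var + diff * incr)
        self.n += 1
        return is_outlier


if __name__ == "__main__":
    detector = EwmaChebyshevDetector()
    for cost in [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 1.0, 0.9, 1.2, 1.0, 10.0]:
        print(cost, detector.update(cost))
```

Each update needs only constant time and three stored numbers, which matches the low-latency, low-memory setting the abstract describes. Chebyshev’s inequality makes no distributional assumption, so `k` bounds the expected false-alarm rate by `1 / k**2` whatever the shape of the cost distribution.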

Highlights

  • Outliers are unexpected observations, which deviate significantly from the expected observations and typically correspond to critical events [22,40,82]

  • The results show that the dataset types (GAUSS1–GAUSS4) are ordered by the challenge of the outlier detection task

  • This is evident as the average area under ROC (AUROC) increases consistently between GAUSS1 and GAUSS4 dataset types
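
For readers unfamiliar with the metric named in the highlights, AUROC can be read as the probability that a randomly chosen outlier receives a higher outlier score than a randomly chosen inlier. The sketch below computes it directly from that pairwise definition. The function name `auroc` and the label convention (1 for outlier, 0 for inlier) are assumptions for illustration, and the quadratic loop is chosen for clarity rather than speed.

```python
def auroc(scores, labels):
    """Area under the ROC curve from its pairwise-ranking definition.

    scores: outlier scores, higher means more outlier-like.
    labels: 1 for outlier, 0 for inlier (assumed convention).
    Ties between an outlier and an inlier count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one outlier and one inlier")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 1.0 means every outlier is ranked above every inlier, and 0.5 corresponds to random ranking.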


Summary

Introduction

Outliers are unexpected observations, which deviate significantly from the expected observations and typically correspond to critical events [22,40,82]. Predicting the occurrence of outliers is useful [27,57,80]. In the context of sensor networks, the prediction of the occurrence of outliers could provide a prior indication of mechanical faults [22] and sensor faults [93]. The prediction of the occurrence of outliers is challenging, because outliers are observed infrequently [35] and the training data for detecting outliers is typically not available. This article refers to the prediction of the occurrence of outliers, or rare events, as outlier prediction.
