Abstract

Anomalies are subsequences that depart from the normal state of operation. To address the problems of unknown data distribution, control limit determination, multiple parameters, reliance on training data, and the fuzziness of the notion of 'anomaly', this paper develops a self-adaptive, unsupervised model for finding anomalies in data streams. Its salient feature is a synergistic combination of statistical and fuzzy set-based techniques. The anomaly detection problem is viewed as a statistical hypothesis test that is realized in an unsupervised mode. At the same time, 'anomaly' is a considerably more complex concept, and as such it can be described with fuzzy set theory. Fuzzy sets bring a facet of robustness to the overall scheme and play an important role in the subsequent step of hypothesis testing. Because of the fuzzification, parameter determination is self-adaptive and no parameter needs to be specified by the user. Moreover, the hypothesis test requires no assumption about the data distribution. The approach is validated in a number of experiments, which quantify the performance of the constructed algorithm.

