Abstract

Efficient anomaly detection is becoming an urgent problem for big data applications. In this paper, we propose a data-driven preprocessing scheme for anomaly detection that incorporates a dimensionality reduction algorithm, and we present a real-time learning approach for big data applications. Specifically, the scheme combines robust data preprocessing with real-time data learning. The robust preprocessing step not only preserves the dimensionality reduction needed for high-dimensional data, but also introduces a detection boundary that is robust to the presence of outliers. The real-time learning method is inspired by online learning: unlike batch-based processing, which learns from an entire batch of data at once, real-time learning aims to make progress with each example it sees. We discuss the justification for this scheme in detail and present a case study to demonstrate its feasibility.
