Abstract

In recent years, data streams have attracted considerable research attention because of their many applications, such as healthcare monitoring systems, fraud and intrusion detection, the Internet of Things (IoT), and financial markets. A data stream is an unbounded sequence of data that is generated continually over time and is prone to evolution. Outliers in streaming data are elements that deviate significantly from the majority of elements; they must be detected because they may be erroneous values or events of interest. Outlier detection is challenging in streaming data and is one of the most crucial tasks in data stream mining. Existing outlier detection methods for static data are unsuitable for data stream settings because of the distinctive characteristics of streaming data, such as unpredictability, uncertainty, high dimensionality, and changes in data distribution. This paper therefore presents a novel ensemble learning framework, Ensemble-based Streaming Outlier Detection (ESOD), for accurately detecting outliers in streaming data. ESOD uses a sliding window that is updated in response to incoming events from the streaming environment, which allows it to cope with concept evolution. The proposed framework has three phases: a training phase, a testing (offline) phase, and an outlier detection (online) phase. A weighted voting technique determines the final decision for each potential outlier. In an extensive experimental study on 11 real-world benchmark datasets, the proposed framework was assessed using several accuracy metrics. The experimental results show that the proposed framework outperforms several state-of-the-art methods.
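To make the combination of a sliding window and a weighted vote concrete, the following is a minimal Python sketch of the general idea, not the ESOD implementation. The abstract does not name the base detectors, their weights, or the window size, so everything here is an illustrative assumption: the three simple univariate detectors (z-score, median absolute deviation, interquartile range), the equal weights, the 0.5 vote cutoff, and the function names are all hypothetical.

```python
from collections import deque
from statistics import mean, median, pstdev, quantiles


def zscore_detector(window, x, threshold=3.0):
    """Vote 'outlier' if x is more than `threshold` std devs from the window mean."""
    sd = pstdev(window)
    return sd > 0 and abs(x - mean(window)) / sd > threshold


def mad_detector(window, x, threshold=3.5):
    """Vote 'outlier' using the median absolute deviation, a robust spread estimate."""
    med = median(window)
    mad = median(abs(v - med) for v in window)
    return mad > 0 and 0.6745 * abs(x - med) / mad > threshold


def iqr_detector(window, x, k=1.5):
    """Vote 'outlier' if x falls outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(window, n=4)
    spread = q3 - q1
    return x < q1 - k * spread or x > q3 + k * spread


# Hypothetical ensemble; the paper's base detectors are not given in the abstract.
DETECTORS = (zscore_detector, mad_detector, iqr_detector)


def weighted_vote(votes, weights, cutoff=0.5):
    """Return True if the weight share of detectors voting 'outlier' reaches cutoff."""
    share = sum(w for v, w in zip(votes, weights) if v) / sum(weights)
    return share >= cutoff


def detect_stream(stream, window_size=10, weights=(1.0, 1.0, 1.0), min_points=5):
    """Return the indices of stream elements flagged by the weighted vote."""
    window = deque(maxlen=window_size)  # oldest element drops out automatically
    flagged = []
    for i, x in enumerate(stream):
        if len(window) >= min_points:  # wait until the window holds enough history
            votes = [detect(window, x) for detect in DETECTORS]
            if weighted_vote(votes, weights):
                flagged.append(i)
        window.append(x)  # the window slides on every incoming element
    return flagged


stream = [10.0, 10.2, 9.8, 10.1, 9.9] * 4
stream[15] = 50.0  # inject one obvious outlier
flagged = detect_stream(stream)
```

In this sketch every element enters the window, including flagged ones, so the window keeps following the stream as its distribution changes. The cost is that a flagged value temporarily inflates the window statistics; excluding flagged values from the window is an alternative design with the opposite trade-off.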
