Abstract

Due to the advancements in information technologies, massive quantity of data is being produced by social media, smartphones, and sensor devices. The investigation of data stream by the use of machine learning (ML) approaches to address regression, prediction, and classification problems have received considerable interest. At the same time, the detection of anomalies or outliers and feature selection (FS) processes becomes important. This study develops an outlier detection with feature selection technique for streaming data classification, named ODFST-SDC technique. Initially, streaming data is pre-processed in two ways namely categorical encoding and null value removal. In addition, Local Correlation Integral (LOCI) is used which is significant in the detection and removal of outliers. Besides, red deer algorithm (RDA) based FS approach is employed to derive an optimal subset of features. Finally, kernel extreme learning machine (KELM) classifier is used for streaming data classification. The design of LOCI based outlier detection and RDA based FS shows the novelty of the work. In order to assess the classification outcomes of the ODFST-SDC technique, a series of simulations were performed using three benchmark datasets. The experimental results reported the promising outcomes of the ODFST-SDC technique over the recent approaches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call