Abstract

New challenges have emerged in data mining because traditional techniques struggle with real-time data streams. These techniques must be reworked to adapt to data streams that exhibit concept drift, so handling concept changes is the most important task in stream data mining. Ensemble classifiers can adapt automatically to incoming drifts, which makes them one of the most active research areas in data stream mining. Bagging, boosting and random forest generation are the common ensemble techniques and are currently among the most popular machine learning approaches for static data (Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9–10):1469–1495, [1]). However, a large number of base classifiers in an ensemble can cause computational overhead. Classifiers for real-time data streams must also be updated constantly and retrained on labeled instances of newly arrived novel classes in order to cope with concept drift; otherwise, the mining models become less and less accurate over time. For data streams, adaptive random forest algorithms have been widely used for ensemble generation because of their ability to handle different types of drift. This paper proposes a modified adaptive random forest algorithm that uses a meta-level learner and the concept-adaptive very fast decision tree to overcome the concept drift problem in real-time data streams. The proposed algorithm is compared experimentally with the state-of-the-art adaptive random forest algorithm on several real and synthetic datasets. The results indicate that it is efficient in terms of both accuracy and processing time.
