Abstract

Classifiers are central to machine learning and predictive analytics because they extract patterns from data and support accurate predictions. In real-world data streams, however, distributions are rarely stable and may evolve over time. Such changes, known as concept drifts, have a major impact on how well classifiers predict outcomes. Virtual concept drifts have become equally important with the growth of virtual environments and synthetic data generation tools. This study uses a range of conventional classifiers and ensemble models to examine the effects of real and virtual concept drifts on predictive performance. To make the conclusions applicable across domains, the study is carried out on a variety of datasets. Real concept drifts are replicated by generating incremental data streams from historical records with drifting patterns, and virtual concept drifts are introduced by incorporating synthetic data into the training and testing sets. The classifiers are then assessed on these datasets using benchmark performance metrics, and comparative evaluations gauge their robustness and adaptability under both kinds of drift. The results shed light on how various classifiers and models perform in dynamic data environments and offer insights to businesses developing more resilient machine learning applications. Three models were compared: Streaming Random Patches (SRP), Adaptive Windowing (ADWIN), and Streaming Random Patches with Adaptive Windowing (SRP-ADWIN). The SRP-ADWIN ensemble outperformed SRP and ADWIN individually, adapting effectively to changing data distributions and achieving predictive accuracy of 98.63% under virtual drift and 86.46% under real drift.
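The distinction between the two kinds of drift can be illustrated with a toy sketch. It assumes the definitions commonly used in the drift literature, which may differ from the study's own setup: real drift changes the labeling rule, while virtual drift changes only the input distribution. The helper names (`make_stream`, `accuracy`), the Gaussian inputs, and the thresholds are all hypothetical, and the code implements neither SRP nor ADWIN.

```python
import random


def make_stream(n, x_mean, threshold, seed):
    """Build n (x, y) pairs with x ~ Normal(x_mean, 1) and y = 1 if x > threshold."""
    rng = random.Random(seed)
    xs = (rng.gauss(x_mean, 1.0) for _ in range(n))
    return [(x, int(x > threshold)) for x in xs]


def accuracy(predict, stream):
    """Fraction of stream items that the classifier labels correctly."""
    return sum(predict(x) == y for x, y in stream) / len(stream)


def predict(x):
    """A fixed classifier that learned the original concept (threshold at 0)."""
    return int(x > 0.0)


# Original concept: inputs centred at 0, labeling rule "x > 0".
baseline = make_stream(2000, x_mean=0.0, threshold=0.0, seed=1)
# Virtual drift (assumed definition): inputs shift, labeling rule unchanged.
virtual = make_stream(2000, x_mean=2.0, threshold=0.0, seed=2)
# Real drift (assumed definition): labeling rule changes, inputs unchanged.
real = make_stream(2000, x_mean=0.0, threshold=1.0, seed=3)

acc_base = accuracy(predict, baseline)
acc_virtual = accuracy(predict, virtual)
acc_real = accuracy(predict, real)

print(f"baseline: {acc_base:.3f}")
print(f"virtual drift: {acc_virtual:.3f}")
print(f"real drift: {acc_real:.3f}")
```

In this idealized setting the fixed classifier already matches the true rule exactly, so a shift in the inputs alone costs it nothing, while a change in the rule lowers its accuracy until it is retrained. A classifier with only an approximate decision boundary could lose accuracy under virtual drift as well, which is one reason adaptive models are evaluated under both conditions.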
