Abstract
Real-time applications are growing rapidly in every field, generating real-time data streams on a large scale. Classifying and clustering this data under limited memory and time constraints is a challenging task. Most existing approaches to data stream classification use supervised learning, which requires training on labeled data only. This paper presents a data stream classification technique that considers both unlabeled and labeled data. The technique builds separate models from chunks of the data stream. Each model is built as a set of micro-clusters, and the k-nearest neighbor algorithm is used for classification. An ensemble of these models can then be used to classify the test stream instances. Two classification approaches are presented: max weighted-sum inter-submodel (MWSISM) and high-frequency max-weight intra-submodel (HFMWISM). The two approaches are compared on how they label instances. Experimental results on the dataset show that HFMWISM is more effective than MWSISM for data stream classification.
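The abstract does not define the two voting rules, so the following is only a minimal sketch of one plausible reading of their names, not the authors' method. It assumes each model is a list of labeled micro-clusters summarized as a hypothetical `(centroid, label, weight)` tuple, that distance is Euclidean, and that unlabeled micro-clusters are left out. Under these assumptions, the inter-submodel rule pools label weights across all models and takes the maximum, while the intra-submodel rule lets each model pick its own maximum-weight label and then takes the most frequent pick.

```python
import math
from collections import Counter, defaultdict

# Hypothetical micro-cluster layout: (centroid, label, weight).
# This is an illustrative assumption, not the paper's definition.

def k_nearest(model, x, k):
    """Return the k micro-clusters in `model` closest to x (Euclidean)."""
    return sorted(model, key=lambda mc: math.dist(mc[0], x))[:k]

def label_weights(model, x, k):
    """Sum micro-cluster weights per label among the k nearest neighbors."""
    totals = defaultdict(float)
    for _, label, weight in k_nearest(model, x, k):
        totals[label] += weight
    return totals

def inter_submodel_vote(ensemble, x, k=2):
    """Assumed MWSISM-style rule: pool weights across models, take the max."""
    pooled = defaultdict(float)
    for model in ensemble:
        for label, w in label_weights(model, x, k).items():
            pooled[label] += w
    return max(pooled, key=pooled.get)

def intra_submodel_vote(ensemble, x, k=2):
    """Assumed HFMWISM-style rule: each model picks its max-weight label,
    then the most frequent pick across models wins."""
    picks = []
    for model in ensemble:
        totals = label_weights(model, x, k)
        picks.append(max(totals, key=totals.get))
    return Counter(picks).most_common(1)[0][0]

# Toy ensemble of three chunk models (made-up values for illustration).
model_a = [((0.0, 0.0), "A", 1.0), ((1.0, 1.0), "B", 5.0)]
model_b = [((0.0, 0.0), "A", 1.0), ((5.0, 5.0), "B", 0.5)]
model_c = [((0.0, 0.0), "A", 1.0), ((5.0, 5.0), "B", 0.5)]
ensemble = [model_a, model_b, model_c]

x = (0.1, 0.1)
inter_label = inter_submodel_vote(ensemble, x)
intra_label = intra_submodel_vote(ensemble, x)
```

On this toy input the two rules disagree: a single heavily weighted micro-cluster in one model dominates the pooled sum, while the per-model majority is less sensitive to it. This only illustrates how such rules can diverge and says nothing about the reported experimental results.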