Abstract

Classification over data streams is a crucial task in social stream mining and computing. Efficient learning techniques enable high-quality content distribution and event browsing services. Because of concept drift and concept evolution in data streams, classification performance degrades drastically over time. Many existing methods rely on supervised or unsupervised learning strategies. However, supervised strategies require labeled emerging records to update the classifiers, which is infeasible in practical social stream applications. Although unsupervised strategies are widely applied to detect concept evolution, online clustering incurs a tremendous run-time computation cost. To address these challenges, we propose an efficient incremental semi-supervised classification method named CODES (Classification Over Drifting and Evolving Stream). CODES consists of an efficient incremental semi-supervised learning module and a dynamic novelty threshold update module. Thus, in drifting and evolving social streams, CODES provides: 1) semi-supervised learning to eliminate the dependency on the labels of emerging records; 2) fast incremental learning with real-time updates to tackle concept drift; 3) efficient novel class detection to tackle concept evolution. Extensive experiments are conducted on several real-world datasets. The results indicate higher performance than several state-of-the-art methods. CODES achieves efficient learning over drifting and evolving social streams, which makes it practical for real-world social stream applications.

Highlights

  • In this era of explosive information distribution, social media platforms push feeds of up-to-date information and user-generated content to users' timelines

  • Designing an efficient social stream classifier faces the following challenges: 1) emerging records have no training labels, which requires a semi-supervised learning method to eliminate the dependency on social record labels; 2) the centroids of classes keep drifting over time, which requires an incremental learning method with a real-time update strategy; 3) records of novel classes emerge, which requires an efficient novel class detection method. To address these challenges simultaneously, we propose a social stream classification method, CODES (Classification Over Drifting and Evolving Stream), which consists of: 1) an incremental semi-supervised classification module based on an extremely fast learning method named Extreme Learning Machine (ELM) [8], [9]; and 2) an efficient novel class detection module

  • In [26], feature-level changes in data streams are studied through a dynamic feature mask clustering method. Unlike these related works, this paper focuses on solving the social stream classification problem with an incremental semi-supervised learning method, which simultaneously provides semi-supervised learning for feasibility in real-world application environments, incremental and extremely fast updates for concept drift, and efficient novel class detection for concept evolution


Summary

Introduction

In this era of explosive information distribution, social media platforms push feeds of up-to-date information and user-generated content to users' timelines. The key to retaining content subscribers is to improve the quality of content distribution and event browsing, in which social stream classification plays a critical role by identifying the topic of each event or record. A major issue in social stream classification is concept drift and concept evolution [1]. Concept drift occurs when the content centroids of emerging social records shift over time. Concept evolution means that the class label space changes when records with new topics emerge in the social stream. Social stream classification therefore calls for an incremental update mechanism and a novelty detection method, which together provide dynamic adaptation to content changes and novel class detection.

The associate editor coordinating the review of this manuscript and approving it for publication was Shirui Pan.
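The difference between concept drift and concept evolution described in the introduction can be illustrated with a small sketch. This is not the CODES method: CODES is built on an Extreme Learning Machine and a dynamic novelty threshold, whereas the sketch below swaps in a plain nearest-centroid classifier with a fixed threshold, purely for illustration. The class name `CentroidStreamClassifier`, its parameters (`novelty_threshold`, `learning_rate`), and the topic labels are all hypothetical.

```python
import math


class CentroidStreamClassifier:
    """Toy nearest-centroid stream classifier (illustrative only, not CODES)."""

    def __init__(self, centroids, novelty_threshold, learning_rate=0.1):
        # centroids: dict mapping a class label to its initial centroid vector.
        self.centroids = {label: list(c) for label, c in centroids.items()}
        # Assumption: a fixed distance threshold; the paper updates it dynamically.
        self.novelty_threshold = novelty_threshold
        self.learning_rate = learning_rate
        self._novel_count = 0

    def _nearest(self, x):
        # Return (label, distance) of the closest known centroid.
        return min(
            ((label, math.dist(x, c)) for label, c in self.centroids.items()),
            key=lambda pair: pair[1],
        )

    def process(self, x):
        """Classify one unlabeled record and update the model in place."""
        label, distance = self._nearest(x)
        if distance > self.novelty_threshold:
            # Concept evolution: the record is far from every known class,
            # so a new class is opened with this record as its centroid.
            self._novel_count += 1
            label = f"novel_{self._novel_count}"
            self.centroids[label] = list(x)
            return label
        # Concept drift: no true label is available, so the predicted label
        # is used to nudge the matching centroid toward the new record.
        centroid = self.centroids[label]
        for i, value in enumerate(x):
            centroid[i] += self.learning_rate * (value - centroid[i])
        return label


clf = CentroidStreamClassifier(
    {"sports": [0.0, 0.0], "politics": [10.0, 0.0]},
    novelty_threshold=3.0,
    learning_rate=0.5,
)
print(clf.process([1.0, 0.0]))  # close to "sports": its centroid drifts
print(clf.process([5.0, 9.0]))  # far from both classes: a novel class opens
```

In this sketch the first record is absorbed by the nearest existing class and shifts its centroid, while the second record exceeds the distance threshold and starts a new class. A fixed threshold like this one would need hand tuning in practice, which is the kind of limitation that motivates a dynamic threshold update.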

Methods
Results
Conclusion
