Abstract
Many real-world applications now generate massive volumes of streaming data at higher speeds than ever before; examples include Web click streams, sensor network data and credit transaction streams. Unlike traditional data mining on static datasets, data stream mining faces several challenges, such as finite memory, one-pass processing and the need for timely reaction. In this survey, we provide a comprehensive review of existing multi-label stream mining algorithms and categorize them from different perspectives, focusing mainly on multi-label data stream classification. We first briefly summarize existing multi-label and data stream classification algorithms and discuss their merits and demerits. Second, we identify the mining constraints on classification for multi-label streaming data and present a comprehensive study of algorithms for multi-label data stream classification. Finally, we discuss several challenges and open issues in multi-label data stream classification that are worth pursuing in future research.
Highlights
In today’s world, many organizations generate massive amounts of data at unprecedented speed
We find that the subset accuracy optimized by RAndom k-labELsets is measured only over the k-labelsets rather than over the whole label space
Data stream classification: we first introduce the related concepts of data streams and data stream classification, then describe the common constraints in data stream classification, focusing mainly on the concept drift issue
Summary
In today’s world, many organizations generate massive amounts of data at unprecedented speed. In a single day, Google processes over 3.5 billion search records, NASA satellites produce about 4 terabytes of image data, and WalMart supermarkets generate over 20 million transaction records. Online social networks such as Twitter and Tencent Weibo have attracted growing attention and generate a huge number of online data streams, including text, images and videos [3]–[7]. These data exhibit new characteristics, namely high volume, high velocity, concept drift and, in particular, multiple labels per instance; we call them multi-label data streams.