Abstract
Learning from a nonstationary data stream is challenging because the stream is generally considered endless and the learning model must be continually updated to adapt to shifting data distributions. The challenge intensifies when the data are multi-label. In this study, we propose an adaptive online weighted multi-label ensemble learning algorithm called MLDME (multi-label learning with distribution matching ensemble). It calculates both the feature matching level and the label matching level between each reserved data block and the newly received data block, and then assigns adaptive decision weights to the ensemble classifiers based on these distribution similarities. Specifically, MLDME abandons the commonly used but not entirely correct hypothesis that, in a data stream, each data block's distribution is closest to that of the block emerging immediately after it; thus, MLDME can provide a just-in-time decision for the newly received data block. In addition, to prevent the ensemble from growing without bound, we store the classifiers in a fixed-size buffer and design three different dynamic classifier updating rules. Experimental results on nine synthetic and three real-world multi-label nonstationary data streams indicate that the proposed MLDME algorithm is superior to several popular and state-of-the-art online learning paradigms and algorithms, including two designed specifically for classifying nonstationary multi-label data streams.
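The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching measures, how label information for the new block is obtained, or the three updating rules. The sketch therefore assumes a hypothetical similarity (`matching_level`, based on the distance between column means), an assumed label estimate for the new block (`new_Y_est`), and a simple first-in-first-out buffer as a stand-in updating rule. All function and variable names are illustrative.

```python
from collections import deque
from math import exp, sqrt


def mean_vector(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def matching_level(block_a, block_b):
    """Hypothetical similarity in (0, 1]: exp(-distance between column means).

    This is a placeholder measure, not the one used by MLDME.
    """
    ma, mb = mean_vector(block_a), mean_vector(block_b)
    return exp(-sqrt(sum((a - b) ** 2 for a, b in zip(ma, mb))))


def decision_weights(buffer, new_X, new_Y_est):
    """One normalized weight per stored member, from feature and label matching."""
    raw = [
        matching_level(X, new_X) * matching_level(Y, new_Y_est)
        for X, Y, _ in buffer
    ]
    total = sum(raw)
    return [r / total for r in raw]


def ensemble_predict(buffer, weights, x, threshold=0.5):
    """Weighted vote per label over the members' 0/1 predictions."""
    n_labels = len(buffer[0][1][0])
    scores = [0.0] * n_labels
    for (_, _, clf), w in zip(buffer, weights):
        for j, vote in enumerate(clf(x)):
            scores[j] += w * vote
    return [1 if s >= threshold else 0 for s in scores]


# Fixed-size buffer of (features, labels, classifier) triples.
# deque(maxlen=...) drops the oldest member when full (FIFO), an assumed rule.
buffer = deque(maxlen=2)
buffer.append(([[0.0, 0.0], [0.2, 0.0]], [[1, 0], [1, 0]], lambda x: [1, 0]))
buffer.append(([[5.0, 5.0], [5.2, 5.0]], [[0, 1], [0, 1]], lambda x: [0, 1]))

new_X = [[0.1, 0.1], [0.3, 0.1]]
new_Y_est = [[1, 0], [1, 0]]  # assumed label estimate for the new block

weights = decision_weights(buffer, new_X, new_Y_est)
prediction = ensemble_predict(buffer, weights, new_X[0])
```

In this toy setting the first stored block resembles the new block in both features and labels, so its classifier receives nearly all of the decision weight, regardless of which block arrived most recently.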