Abstract
Extreme multi-label classification (XMC) refers to supervised multi-label learning involving hundreds of thousands or even millions of labels. In this paper, we develop a suite of algorithms, called Bonsai, which generalizes the notion of label representation in XMC and partitions the labels in the representation space to learn shallow trees. We show three concrete realizations of this label representation space: (i) the input space, which is spanned by the input features; (ii) the output space, spanned by label vectors based on their co-occurrence with other labels; and (iii) the joint space, obtained by combining the input and output representations. Furthermore, the constraint-free multi-way partitions learnt iteratively in these spaces lead to shallow trees. By combining the effect of shallow trees and generalized label representation, Bonsai achieves the best of both worlds: fast training, comparable to state-of-the-art tree-based methods in XMC, and much better prediction accuracy, particularly on tail labels. On the benchmark Amazon-3M dataset with 3 million labels, Bonsai outperforms a state-of-the-art one-vs-rest method in terms of prediction accuracy, while being approximately 200 times faster to train. The code for Bonsai is available at https://github.com/xmc-aalto/bonsai.
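To make the three label representation spaces and the multi-way partitioning more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a dense feature matrix X (instances by features) and a binary label matrix Y (instances by labels). It also assumes that a label's input-space vector is the normalized sum of the feature vectors of its positive instances, that its output-space vector is its normalized row of the label co-occurrence matrix, and that the joint vector is built from the concatenation of the two. The function names, the cosine-similarity k-means routine, and the parameters k and max_leaf are hypothetical choices made for illustration only.

```python
import numpy as np

def unit_rows(M):
    """Scale each row to unit L2 norm (all-zero rows stay zero)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.maximum(norms, 1e-12)

def label_representations(X, Y):
    """Return one vector per label in the input, output and joint spaces.

    X: (n, d) feature matrix; Y: (n, L) binary label matrix.
    Assumed construction: sum over the instances that carry the label.
    """
    V_in = unit_rows(Y.T @ X)                      # (L, d) input space
    V_out = unit_rows(Y.T @ Y)                     # (L, L) label co-occurrence
    V_joint = unit_rows(Y.T @ np.hstack([X, Y]))   # (L, d + L) joint space
    return V_in, V_out, V_joint

def kmeans_partition(V, k, n_iter=20, seed=0):
    """Unconstrained k-way clustering of the rows of V by cosine similarity."""
    rng = np.random.default_rng(seed)
    centers = V[rng.choice(len(V), size=k, replace=False)].copy()
    for _ in range(n_iter):
        assign = np.argmax(V @ centers.T, axis=1)
        for c in range(k):
            members = V[assign == c]
            if len(members) > 0:
                centers[c] = unit_rows(members.mean(axis=0, keepdims=True))[0]
    return np.argmax(V @ centers.T, axis=1)

def build_tree(V, labels, k=100, max_leaf=1):
    """Recursively split label indices into up to k groups per node."""
    labels = np.asarray(labels)
    if len(labels) <= max_leaf:
        return {"labels": labels.tolist(), "children": []}
    assign = kmeans_partition(V[labels], min(k, len(labels)))
    groups = [labels[assign == c] for c in np.unique(assign)]
    if len(groups) < 2:  # clustering could not split further: stop here
        return {"labels": labels.tolist(), "children": []}
    children = [build_tree(V, g, k, max_leaf) for g in groups]
    return {"labels": labels.tolist(), "children": children}

def collect_leaves(node):
    """Gather the label indices stored at the leaves of the tree."""
    if not node["children"]:
        return list(node["labels"])
    return [l for child in node["children"] for l in collect_leaves(child)]

def tree_depth(node):
    """Number of edges on the longest root-to-leaf path."""
    if not node["children"]:
        return 0
    return 1 + max(tree_depth(child) for child in node["children"])
```

Under these assumptions, calling build_tree(V_joint, np.arange(L), k=100) on a label set of size L yields a tree whose depth grows roughly like the logarithm of L to base k, which is why a large fanout gives a shallow tree, whereas k=2 gives a much deeper one.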
Highlights
Extreme Multi-label Classification (XMC) refers to supervised learning of a classifier which can automatically label an instance with a small subset of relevant labels from an extremely large set of all possible target labels
Our work generalizes the approach taken in many earlier works, which have represented labels only in the input space (Prabhu et al. 2018; Wydmuch et al. 2018) or only in the output space (Tsoumakas et al. 2008). We show that these representations, when combined with shallow trees, surpass existing methods, demonstrating the efficacy of the proposed generalized representation
The consistent improvement of Bonsai over Parabel on all datasets validates the choice of a higher fanout and the advantages of using shallow trees
Summary
Extreme Multi-label Classification (XMC) refers to supervised learning of a classifier which can automatically label an instance with a small subset of relevant labels from an extremely large set of all possible target labels. From the machine learning perspective, building effective extreme classifiers faces computational challenges arising from the large number of (i) output labels, (ii) input training instances, and (iii) input features. Another important statistical characteristic of XMC datasets is that a large fraction of labels are tail labels, i.e., labels with very few training instances (a phenomenon variously referred to as a power-law or fat-tailed distribution, or Zipf's law).
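As a small illustration of what "tail labels" means in practice, the sketch below computes the fraction of labels that have at most a given number of training instances. The function name and the threshold max_count are hypothetical; the text above does not fix a specific cutoff for what counts as a tail label.

```python
import numpy as np

def tail_label_fraction(Y, max_count=5):
    """Fraction of labels with at most `max_count` positive training instances.

    Y: (n, L) binary label matrix. `max_count` is an illustrative cutoff,
    not a value taken from the paper.
    """
    counts = np.asarray(Y).sum(axis=0)  # number of instances per label
    return float(np.mean(counts <= max_count))
```

On a dataset with a power-law label distribution, this fraction stays large even for small cutoffs, which is the property the summary refers to.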