Abstract
Training of one-vs.-rest SVMs can be parallelized over the number of classes in a straightforward way. Given enough computational resources, one-vs.-rest SVMs can thus be trained on data involving a large number of classes. The same cannot be stated, however, for the so-called all-in-one SVMs, which require solving a quadratic program whose size is quadratic in the number of classes. We develop distributed algorithms for two all-in-one SVM formulations (Lee et al. and Weston and Watkins) that parallelize the computation evenly over the number of classes. This allows us to compare these models to one-vs.-rest SVMs on an unprecedented scale. The results indicate superior accuracy on text classification data.
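To illustrate why one-vs.-rest training parallelizes so easily, here is a minimal sketch in which each class gets its own independent binary problem that is handed to a separate worker. It is not the paper's code: the function names, the toy data, and the Pegasos-style subgradient solver are assumptions chosen only to keep the example short and dependency-free. A thread pool stands in for whatever parallel backend is actually available; real speedups in Python would need processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def train_binary_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Linear SVM via Pegasos-style stochastic subgradient descent.

    Hypothetical stand-in solver; labels in y must be +1.0 or -1.0.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrinkage step coming from the L2 regularizer.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss subgradient step if the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def train_one_vs_rest(X, labels, classes, max_workers=4):
    """Train one binary SVM per class; the per-class tasks share nothing."""
    def fit(c):
        y = [1.0 if label == c else -1.0 for label in labels]
        return c, train_binary_svm(X, y)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fit, classes))


def predict(models, x):
    """Return the class whose binary model gives the highest score."""
    return max(models, key=lambda c: sum(wj * xj for wj, xj in zip(models[c], x)))


# Toy data (assumed): each class is marked by one indicator feature.
classes = ["sports", "politics", "science"]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]] * 4
labels = classes * 4

models = train_one_vs_rest(X, labels, classes)
predictions = [predict(models, x) for x in X[:3]]
```

Because `fit` for one class never reads the result of another, the work splits evenly across workers with no communication. The all-in-one formulations couple all classes in a single optimization problem, so no such split is available without further algorithmic work.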
Highlights
Modern data analysis requires computation with a large number of classes
We address scaling up multi-class support vector machines (MC-SVMs) [1]
We propose distributed algorithms for solving the multi-class SVM formulations by Lee et al. (LLW) and Weston and Watkins (WW)
Summary
Modern data analysis requires computation with a large number of classes. As examples, consider the following. (1) We are continuously monitoring the internet for new webpages, which we would like to categorize. (2) We have data from an online biomedical bibliographic database that we want to index for quick access by clinicians. (3) We are collecting data from an online feed of photographs that we would like to classify into image categories. (4) We add new articles to an online encyclopedia and intend to predict the categories of the articles. (5) Given a huge collection of ads, we want to build a classifier from this data. The problems above, taken from varying application domains ranging from the sciences to technology, involve a large number of classes, typically at least in the thousands. This motivates research on scaling up multi-class classification methods.