On empirical tradeoffs in large scale hierarchical classification

Rohit Babbar, Ioannis Partalas, Cecile Amblard, Eric Gaussier

doi:10.1145/2396761.2398625

Abstract

While multi-class categorization of documents has been of research interest for over a decade, relatively few approaches have been proposed for large scale taxonomies in which the number of classes ranges from hundreds of thousands, as in Directory Mozilla, to over a million, as in Wikipedia. As a result of the ever-increasing number of text documents and images from various sources, there is an immense need for automatic classification of documents in such large hierarchies. In this paper, we analyze the tradeoffs between the important characteristics of different classifiers employed in a top-down fashion. The properties used for relative comparison of these classifiers are (i) accuracy on test instances, (ii) training time, (iii) size of the model, and (iv) test time required for prediction. Our analysis is motivated by well-known error bounds from learning theory and is further reinforced by empirical observations on the publicly available data from the Large Scale Hierarchical Text Classification Challenge. We show that by exploiting the data heterogeneity across large scale hierarchies, one can build an overall classification system that is approximately 4 times faster for prediction and 3 times faster to train, while sacrificing only 1 percentage point in accuracy.
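As an illustration of what "top-down" classification means here, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `Node` structure in which every internal node holds a small linear scorer per child, and a document represented as a term-count dictionary. The names `Node`, `score`, `predict_top_down`, and `toy_taxonomy`, as well as all weights, are invented for the example. Prediction descends greedily from the root, so test-time cost depends on the children scored along one path rather than on the total number of leaf classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical taxonomy node with one linear scorer per child."""
    label: str
    # child label -> {term: weight}; hand-set here, learned in practice
    weights: Dict[str, Dict[str, float]] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)


def score(doc: Dict[str, int], w: Dict[str, float]) -> float:
    """Dot product between a term-count document and a weight vector."""
    return sum(w.get(term, 0.0) * count for term, count in doc.items())


def predict_top_down(root: Node, doc: Dict[str, int]) -> List[str]:
    """Greedily follow the best-scoring child from the root down to a leaf."""
    path = [root.label]
    node = root
    while node.children:
        best = max(
            node.children,
            key=lambda child: score(doc, node.weights.get(child, {})),
        )
        node = node.children[best]
        path.append(node.label)
    return path


def toy_taxonomy() -> Node:
    """Build a two-level toy hierarchy with made-up weights."""
    science = Node(
        "Science",
        weights={"Physics": {"quark": 1.0, "energy": 1.0},
                 "Biology": {"cell": 1.0}},
        children={"Physics": Node("Physics"), "Biology": Node("Biology")},
    )
    sports = Node(
        "Sports",
        weights={"Football": {"goal": 1.0}, "Tennis": {"match": 1.0}},
        children={"Football": Node("Football"), "Tennis": Node("Tennis")},
    )
    return Node(
        "Root",
        weights={"Science": {"quark": 1.0, "energy": 0.5, "cell": 1.0},
                 "Sports": {"goal": 1.0, "match": 1.0}},
        children={"Science": science, "Sports": sports},
    )


if __name__ == "__main__":
    print(predict_top_down(toy_taxonomy(), {"quark": 2, "energy": 1}))
```

In this sketch an error made at an upper level cannot be corrected further down, which is one reason the choice of classifier at each node affects the accuracy, training time, model size, and prediction time that the abstract compares.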

Publication Date: Oct 29, 2012
Citations: 1	License type: other-oa

