Abstract

Research in the field of supervised classification has mostly focused on the standard, so-called flat classification approach, where the problem classes live in a trivial, one-level semantic space. There is, however, an increasing interest in the hierarchical classification approach, where a performance gain is expected from incorporating prior taxonomic knowledge about the classes into the learning process. Intuitively, the hierarchical approach should be beneficial in general for the classification of visual content, as suggested by the fact that humans seem to organize objects into hierarchies based on visually perceived similarities. In this paper, we provide an analysis that aims to determine the conditions under which the hierarchical approach can consistently outperform the flat approach for the classification of visual content. In particular, we (1) show how hierarchical methods can fail to outperform flat methods when applied to real vision-based classification problems, and (2) investigate the underlying reasons for the lack of improvement by applying the same methods to synthetic datasets in a simulation. Our conclusion is that the use of high-level hierarchical feature representations is crucial for obtaining a performance gain with the hierarchical approach, and that poorly chosen prior taxonomies hinder this gain even when proper high-level features are used.

Highlights

  • Most of the theoretical work and applications in the field of supervised classification have been dedicated to the standard classification approach, where the problem classes are considered to be different from each other in a semantic sense [28].

  • For maximum margin-based regression (MMR), using a polynomial kernel yields the best performance, and the core parameter for MMR is the degree of this polynomial kernel.

  • To avoid overloading the reader with excessive experimentation, we only show and discuss results for the methods structured output k-nearest neighbors (SkNN) and structured output support vector machine (SSVM), and their respective flat counterparts k-nearest neighbors (kNN) and MKSVM.



Introduction

Most of the theoretical work and applications in the field of supervised classification have been dedicated to the standard classification approach, where the problem classes are considered to be different from each other in a semantic sense [28]. In this standard approach, known as “flat” classification, a classifier is learned from class-labeled data instances without any explicit information about the high-level semantic relationships between the classes. One could consider that ants and bees are part of a superclass of insects, while hammers belong to another superclass of tools, and it is intuitive that such hierarchical knowledge about the classes can help improve classification performance. Based upon this realization, a new approach has emerged for dealing more efficiently with the classification of content deemed to be inherently semantically hierarchical, i.e., the hierarchical classification approach [28]. The attention given to the hierarchical approach has been sustained by advances in machine learning generalized to arbitrary output spaces, i.e., the structured classification approach (e.g., [31]), of which the hierarchical approach is a special case.
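To make the contrast concrete: a flat classifier decides among all leaf classes at once, while a top-down hierarchical classifier first decides among superclasses and then among the leaf classes of the chosen superclass. The sketch below illustrates this with simple nearest-centroid decisions on toy 2-D data; the taxonomy, class names, and data points are purely illustrative assumptions, not taken from the paper's experiments.

```python
# Minimal, hypothetical sketch: flat vs. top-down hierarchical classification
# using nearest-centroid decisions. Taxonomy and data are toy examples.
from statistics import mean

# Toy taxonomy: two superclasses, each with two leaf classes.
TAXONOMY = {
    "insect": ["ant", "bee"],
    "tool": ["hammer", "wrench"],
}

# Toy training data: a few (x, y) feature points per leaf class.
TRAIN = {
    "ant":    [(0.0, 0.0), (0.2, 0.1)],
    "bee":    [(1.0, 0.0), (1.2, 0.1)],
    "hammer": [(0.0, 5.0), (0.2, 5.1)],
    "wrench": [(1.0, 5.0), (1.2, 5.1)],
}

def centroid(points):
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))

def nearest(x, centroids):
    # Return the label whose centroid is closest (squared Euclidean distance).
    return min(centroids,
               key=lambda c: (x[0] - centroids[c][0]) ** 2
                           + (x[1] - centroids[c][1]) ** 2)

def flat_predict(x):
    # Flat approach: one decision over all leaf classes.
    leaf_centroids = {c: centroid(pts) for c, pts in TRAIN.items()}
    return nearest(x, leaf_centroids)

def hierarchical_predict(x):
    # Hierarchical approach, step 1: decide the superclass using the
    # pooled training points of each superclass's leaves.
    super_centroids = {
        s: centroid([pt for leaf in leaves for pt in TRAIN[leaf]])
        for s, leaves in TAXONOMY.items()
    }
    s = nearest(x, super_centroids)
    # Step 2: decide among the leaf classes of the chosen superclass only.
    leaf_centroids = {leaf: centroid(TRAIN[leaf]) for leaf in TAXONOMY[s]}
    return nearest(x, leaf_centroids)
```

The hierarchical variant restricts the second decision to the leaves of the selected superclass, which is precisely where a well-chosen taxonomy can help and a poorly chosen one can hurt: an error at the superclass level can never be recovered at the leaf level.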

