Abstract

A large amount of research on Convolutional Neural Networks (CNNs) has focused on flat classification in the multi-class domain. In the real world, however, many problems are naturally expressed as hierarchical classification problems, in which the classes to be predicted are organized in a class hierarchy. In this paper, we propose a new architecture for hierarchical classification that introduces a stack of deep linear layers with cross-entropy loss functions combined with a center loss function. The proposed architecture can extend any neural network model: it simultaneously optimizes loss functions that discover local hierarchical class relationships and a loss function that discovers global information from the whole class hierarchy while penalizing class-hierarchy violations. We show experimentally that our hierarchical classifier offers advantages over traditional classification approaches in computer vision tasks. The same approach can also be applied to CNNs for text classification.

Highlights

  • In recent years, researchers have become increasingly interested in multi-label and hierarchical learning approaches, which have found applications in several domains, including classification [1, 2], image annotation [3], and bioinformatics [4, 5, 6, 7].

  • With a structured output, the predicted information spans different levels of abstraction, while with the flat multi-label approach it lies on a single level.

  • We propose a new Hierarchical Deep Loss (HDL) function as an extension of convolutional neural networks to assign hierarchical multi-labels to images.


Introduction

In recent years, researchers have become increasingly interested in multi-label and hierarchical learning approaches, which have found applications in several domains, including classification [1, 2], image annotation [3], and bioinformatics [4, 5, 6, 7]. Human beings perceive the world at different levels of granularity and can translate information from coarse-grained to fine-grained and vice versa, perceiving different levels of abstraction of the information acquired [9, 10]. This concept is reflected in the taxonomy underlying multi-label classification. In terms of neural models, the main difference between structured-output prediction and flat multi-label classification lies in the level of the neurons that contain the label prediction. With a structured output, the predicted information spans different levels of abstraction, while with the flat multi-label approach it lies on a single level.
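To make the combination of losses described in the abstract concrete, the following is a minimal NumPy sketch of how per-level cross-entropy terms and a global center-loss term might be summed into one objective. This is an illustration under our own assumptions, not the paper's implementation: the function names (`hierarchical_loss`, `center_loss`), the weighting factor `lam`, and the choice to attach the center loss to the deepest level's label are all hypothetical.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over a 1-D logit vector
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, target):
    # negative log-likelihood of the target class index
    return -np.log(softmax(logits)[target] + 1e-12)

def center_loss(feature, centers, target):
    # squared distance between the feature vector and its class center
    d = feature - centers[target]
    return 0.5 * float(d @ d)

def hierarchical_loss(level_logits, level_targets, feature, centers, lam=0.1):
    # one cross-entropy term per hierarchy level (local relationships),
    # plus a center-loss term on the feature vector (global information);
    # here the center loss uses the deepest (last) level's label
    ce = sum(cross_entropy(l, t) for l, t in zip(level_logits, level_targets))
    return ce + lam * center_loss(feature, centers, level_targets[-1])
```

For example, with a two-level hierarchy (2 coarse classes, 3 fine classes), `hierarchical_loss` would receive a list of two logit vectors and a list of two target indices, and the total loss decreases as each level's prediction sharpens toward its target and as features cluster around their class centers.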
