Abstract

Convolutional Neural Networks (CNNs) demonstrate state-of-the-art performance in large-scale multi-class image classification tasks. CNNs consist of convolution layers that progressively construct features and a classification layer. Typically, a softmax function is used in the classification layer to learn joint probabilities for the classes, which are subsequently used for class prediction. We refer to such an approach as the joint approach to multi-class classification. Another approach in the literature determines the multi-class prediction through a sequence of binary decisions, and is known as the class binarization approach. A popular type of class binarization is Error Correcting Output Codes (ECOC). In this paper, we propose to incorporate ECOC into CNNs by inserting a latent-binarization layer into a CNN's classification layer. This approach encapsulates both the encoding and decoding steps of ECOC within a single CNN capable of discovering an optimal coding matrix during training. The latent-binarization layer is motivated by the family of latent-trait and latent-class models used in behavioral research. We call the proposed models CNNs with Latent Binarization (LB-CNNs), and develop algorithms combining EM and back-propagation to train them. The proposed models and algorithms are applied to several image recognition tasks, producing excellent results. Furthermore, LB-CNNs can also enhance the interpretability of the decision process of CNNs.
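To make the class binarization idea concrete, the following is a minimal sketch of classical ECOC decoding, not the paper's learned LB-CNN layer: each class is assigned a binary codeword from a coding matrix, the binary classifiers produce a bit vector for a sample, and the sample is assigned to the class whose codeword is nearest in Hamming distance. The coding matrix and class names below are hypothetical illustrations.

```python
# Hypothetical coding matrix: 4 classes, each encoded by 6 binary classifiers.
CODING_MATRIX = {
    "cat":  (0, 0, 1, 1, 0, 1),
    "dog":  (1, 0, 0, 1, 1, 0),
    "bird": (0, 1, 1, 0, 1, 0),
    "fish": (1, 1, 0, 0, 0, 1),
}

def hamming(a, b):
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(bits):
    """Assign the class whose codeword is nearest to the predicted bits."""
    return min(CODING_MATRIX, key=lambda cls: hamming(CODING_MATRIX[cls], bits))

# A noisy prediction (last bit flipped from "dog"'s codeword) still decodes
# correctly, illustrating the error-correcting property of ECOC.
print(decode((1, 0, 0, 1, 1, 1)))  # → dog
```

The error-correcting property comes from the decoding step: as long as fewer bits are flipped than half the minimum Hamming distance between codewords, the nearest codeword is still the correct class. The paper's contribution, by contrast, is to learn the coding matrix jointly with the network rather than fixing it in advance.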
