Abstract

Convolutional Neural Networks (CNNs) demonstrate state-of-the-art performance in large-scale multi-class image classification tasks. CNNs consist of convolution layers that progressively construct features and a classification layer. Typically, a softmax function is used in the classification layer to learn joint probabilities for the classes, which are subsequently used for class prediction. We refer to such an approach as the joint approach to multi-class classification. Another approach in the literature determines the multi-class prediction through a sequence of binary decisions, and is known as the class binarization approach. A popular type of class binarization is error-correcting output codes. In this paper, we propose to incorporate error-correcting output codes into convolutional neural networks by inserting a latent binarization layer in a CNN's classification layer. This approach encapsulates both the encoding and decoding steps of error-correcting output codes in a single CNN architecture that is capable of discovering an optimal coding matrix during training. The latent binarization layer is motivated by the family of latent-trait and latent-class models used in behavioral research. We call the proposed convolutional neural networks with latent binarization LB-CNNs, and develop algorithms combining the expectation-maximization algorithm and back-propagation to train them. The proposed models and algorithms are applied to several image recognition tasks, producing excellent results. Furthermore, our model enhances the interpretability of the decision process of standard convolutional neural networks.
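For readers unfamiliar with error-correcting output codes, the following is a minimal sketch of the classical ECOC scheme the paper builds on, not the paper's latent-binarization layer. A fixed, hand-chosen coding matrix (hypothetical here) assigns each class a binary codeword; a separate binary classifier predicts each bit, and decoding picks the class whose codeword is nearest in Hamming distance, so a few bit errors can be corrected:

```python
import numpy as np

# Hypothetical coding matrix for 4 classes with 6-bit codewords
# (rows = classes, columns = binary classification problems).
# Its minimum pairwise Hamming distance is 4, so any single
# bit error can be corrected.
CODING_MATRIX = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
])

def decode(bits: np.ndarray) -> int:
    """Return the class whose codeword is nearest in Hamming distance
    to the bit vector produced by the binary classifiers."""
    distances = (CODING_MATRIX != bits).sum(axis=1)
    return int(distances.argmin())

# One bit of class 2's codeword [1, 0, 1, 0, 1, 0] is flipped,
# yet decoding still recovers class 2.
noisy = np.array([1, 0, 1, 0, 1, 1])
print(decode(noisy))  # -> 2
```

In contrast to this fixed-matrix scheme, the LB-CNN described in the abstract learns the coding matrix jointly with the network during training.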
