Softly combining an ensemble of classifiers learned from a single convolutional neural network for scene categorization

Shuang Bai, Huadong Tang

doi:10.1016/j.asoc.2018.03.007

Abstract

In this paper, we propose to train an ensemble of classifiers from a single convolutional neural network (CNN) and to softly combine these classifiers for scene categorization. Specifically, we explore the hierarchical structure of a CNN to extract multiple types of features from images, and train one multi-class classifier for each feature type. To combine these classifiers effectively, we introduce a soft combination strategy. Because different images may need to be discriminated using different types of features, we train a set of auxiliary binary classifiers, each of which estimates how well the corresponding multi-class classifier categorizes a given image, so that a dynamic weight can be assigned to each multi-class classifier for combination. In addition, because features extracted from different layers of a CNN differ greatly in their level of abstraction, classifiers trained on these features have quite different capabilities for scene categorization. To address this issue, the soft combination strategy uses a genetic algorithm to learn a second set of static weights for the multi-class classifiers. The static weights adapt the multi-class classifiers to a given dataset. Finally, to categorize an image, the multi-class classifiers are combined using both the dynamic and the static weights. We conduct experiments on two challenging benchmark datasets, MIT-indoor scene 67 and SUN397. Experimental results show that the proposed method is effective for scene categorization and can outperform state-of-the-art approaches.
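The final combination step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the dynamic and static weights are merged, so the code below makes an assumption: each classifier's two weights are multiplied, the products are normalized, and the class-probability vectors are averaged with those weights. The function name `soft_combine` and its parameters are hypothetical and do not come from the paper.

```python
def soft_combine(class_probs, dynamic_weights, static_weights):
    """Combine K multi-class classifiers for a single image.

    class_probs:     K lists of C class probabilities, one per classifier.
    dynamic_weights: K per-image quality estimates (e.g. outputs of the
                     auxiliary binary classifiers).
    static_weights:  K dataset-level weights (e.g. learned by a genetic
                     algorithm).

    ASSUMPTION: weights are merged by multiplication and normalized to sum
    to one; the paper's actual combination rule may differ.
    """
    weights = [d * s for d, s in zip(dynamic_weights, static_weights)]
    total = sum(weights)
    if total <= 0:
        # Fall back to a uniform average when every weight is zero.
        weights = [1.0 / len(weights)] * len(weights)
    else:
        weights = [w / total for w in weights]

    num_classes = len(class_probs[0])
    # Weighted sum of the classifiers' probabilities for each class.
    scores = [
        sum(w * probs[c] for w, probs in zip(weights, class_probs))
        for c in range(num_classes)
    ]
    label = max(range(num_classes), key=scores.__getitem__)
    return label, scores


# Two classifiers (e.g. built on different CNN layers), three scene classes.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1]]
label_a, scores_a = soft_combine(probs, [0.9, 0.1], [1.0, 1.0])
label_b, scores_b = soft_combine(probs, [0.1, 0.9], [1.0, 1.0])
```

With identical static weights, the predicted class follows whichever classifier receives the larger dynamic weight for that image, which is the behavior the dynamic weighting is meant to provide.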

Full Text