Convolutional neural networks (CNNs) have enabled tremendous achievements in image classification, as the model can automatically extract image features and assign an appropriate class. Nevertheless, the classification lacks robustness to input perturbations that are invisible to humans. To improve the robustness of CNN models, it is necessary to understand their decision-making procedure. By inspecting the learned feature space, we found that the classification regions are not always clearly separated by the CNN model. Overlapping classification regions increase the likelihood that small perturbation-induced input changes alter the classification result. Therefore, a clear separation of the CNN model’s feature spaces should support decision robustness. In this paper, we propose a novel loss function called “conformity loss” that encourages disjoint feature spaces during learning at different layers of the CNN, in order to improve intra-class compactness and inter-class differences in the trained representations. The same function is used as an evaluation metric to measure feature-space separation during testing. In conclusion, the model trained with the conformity loss shows better feature-space separation at comparable classification performance.
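The abstract does not spell out the form of the conformity loss, so the following is only a minimal sketch of a loss in the same spirit: a center-loss-style objective that rewards intra-class compactness and penalizes inter-class overlap. All names here (`ConformityStyleLoss`, `margin`, the hinge formulation) are illustrative assumptions, not the paper’s actual definition.

```python
import torch
import torch.nn as nn


class ConformityStyleLoss(nn.Module):
    """Hypothetical sketch of a separation-promoting loss.

    Pulls each feature vector toward a learnable per-class center
    (intra-class compactness) and pushes different class centers
    apart beyond a margin (inter-class separation). The paper's
    actual "conformity loss" may differ.
    """

    def __init__(self, num_classes: int, feat_dim: int, margin: float = 1.0):
        super().__init__()
        # One learnable center per class in the feature space.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.margin = margin

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Intra-class term: squared distance of each feature to its class center.
        centers_batch = self.centers[labels]                      # (B, D)
        intra = (features - centers_batch).pow(2).sum(dim=1).mean()

        # Inter-class term: hinge penalty whenever two class centers are
        # closer than `margin`, encouraging disjoint classification regions.
        dists = torch.cdist(self.centers, self.centers)           # (C, C)
        off_diag = dists[~torch.eye(len(self.centers), dtype=torch.bool)]
        inter = torch.clamp(self.margin - off_diag, min=0).pow(2).mean()

        return intra + inter
```

In training, such a term would typically be added to the standard cross-entropy objective with a weighting factor, e.g. `loss = ce_loss + lam * conformity(features, labels)`; the same quantity could then be reported on held-out data as a measure of feature-space separation, as the abstract describes.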