Abstract

Ensemble learning is a widely used technique for training deep convolutional neural networks (CNNs) with improved robustness and accuracy. While existing algorithms usually first train multiple diversified networks and then assemble them into an aggregated classifier, we propose a novel learning paradigm, "In-Network Ensemble" (INE), that incorporates the diversity of multiple models by training a SINGLE deep neural network. Specifically, INE segments the outputs of the CNN into multiple independent classifiers, each of which is further fine-tuned for improved accuracy through a diversified knowledge distillation process. We then aggregate the fine-tuned independent classifiers using an Averaging-and-Softmax operator to obtain the final ensemble classifier. Note that, in the supervised learning setting, INE trains the CNN from random initialization, while, in the transfer learning setting, it can also start from a pre-trained model to incorporate knowledge learned from additional datasets. Extensive experiments were conducted on eight large-scale real-world datasets, including CIFAR, ImageNet, and Stanford Cars, among others, with common deep network architectures such as VGG, ResNet, and Wide ResNet. We evaluated the method on two tasks, supervised learning and transfer learning; the results show that INE outperforms state-of-the-art deep ensemble learning algorithms in accuracy.
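
To make the core idea more concrete, the sketch below illustrates in PyTorch how a single backbone's output can be segmented into multiple independent classifier heads and aggregated with an Averaging-and-Softmax operator. The backbone choice (ResNet-18), the number of heads, and the equal-slice segmentation of the feature vector are illustrative assumptions, not the paper's exact design; the diversified knowledge distillation step is not reproduced here.

```python
# Minimal sketch of the "In-Network Ensemble" idea: one shared backbone,
# multiple independent classifier heads, Averaging-and-Softmax aggregation.
import torch
import torch.nn as nn
import torchvision.models as models


class InNetworkEnsemble(nn.Module):
    def __init__(self, num_classes: int, num_heads: int = 4):
        super().__init__()
        backbone = models.resnet18(weights=None)  # or a pre-trained model for transfer learning
        feat_dim = backbone.fc.in_features        # 512 for ResNet-18
        backbone.fc = nn.Identity()               # keep only the feature extractor
        self.backbone = backbone
        self.num_heads = num_heads
        # Segment the shared feature vector into equal slices, one per head
        # (assumes feat_dim is divisible by num_heads).
        slice_dim = feat_dim // num_heads
        self.heads = nn.ModuleList(
            [nn.Linear(slice_dim, num_classes) for _ in range(num_heads)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)                            # (B, feat_dim)
        chunks = torch.chunk(feats, self.num_heads, dim=1)  # one slice per head
        logits = [head(c) for head, c in zip(self.heads, chunks)]
        # Averaging-and-Softmax: average the per-head logits, then normalize.
        avg_logits = torch.stack(logits, dim=0).mean(dim=0)
        return torch.softmax(avg_logits, dim=1)


if __name__ == "__main__":
    model = InNetworkEnsemble(num_classes=10, num_heads=4)
    probs = model(torch.randn(2, 3, 224, 224))
    print(probs.shape)  # torch.Size([2, 10])
```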
