Abstract

Deep neural networks have demonstrated outstanding performance on a variety of image classification tasks, yet action recognition from still images remains challenging due to the lack of temporal information, the scarcity of data, large intra-class variation, and high similarity between different actions. Ensemble learning is one of the most straightforward ways to address these problems; however, ensemble models are computationally expensive, especially when they combine several deep neural networks. To deal with this issue, we first construct an ensemble model from three separately trained deep convolutional neural networks. Using this ensemble as a teacher, we then train a lightweight student network within a knowledge distillation framework. The resulting lightweight model achieves 94.32% mAP on the Stanford40 dataset, outperforming many existing methods while remaining computationally efficient.
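The abstract does not specify the exact distillation objective, but the standard formulation (following Hinton et al.) combines a temperature-softened KL term against the teacher's outputs with an ordinary cross-entropy term against the ground-truth label. A minimal NumPy sketch of that loss, with the temperature `T` and mixing weight `alpha` as illustrative hyperparameters rather than values taken from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T produces softer distributions."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.7):
    """Weighted sum of the soft-target term and the hard-label term.

    T and alpha are illustrative defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 so gradients keep a comparable
    # magnitude as T varies (as in the original distillation formulation).
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)))
    # Standard cross-entropy against the ground-truth class (T = 1).
    ce = -np.log(softmax(student_logits)[true_label] + 1e-12)
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

A student that already matches the teacher's logits incurs no KL penalty, so the loss reduces to the (weighted) cross-entropy term; the teacher's soft probabilities carry inter-class similarity information that the one-hot label alone does not.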
