Abstract

Deep neural networks have demonstrated outstanding performance on a variety of image classification tasks, yet action recognition from still images remains challenging due to the lack of temporal information, the scarcity of data, large intra-class variation, and high similarity between different actions. Ensemble learning is one of the most straightforward ways to address these problems; however, ensemble models are computationally expensive, especially when they combine several deep neural networks. To deal with this issue, we first construct an ensemble model from three separately trained deep convolutional neural networks. Using this ensemble as a teacher, we then train a lightweight student network within a knowledge distillation framework. The resulting lightweight model achieves 94.32% mAP on the Stanford40 dataset, outperforming many existing methods while remaining computationally efficient.
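The abstract does not specify the exact distillation objective, but the standard formulation (following Hinton et al.) combines a temperature-softened KL term against the teacher's outputs with an ordinary cross-entropy term against the ground-truth label. A minimal NumPy sketch of that loss, with the temperature `T` and mixing weight `alpha` as illustrative hyperparameters rather than values taken from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T produces softer distributions."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.7):
    """Weighted sum of the soft-target term and the hard-label term.

    T and alpha are illustrative defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 so gradients keep a comparable
    # magnitude as T varies (as in the original distillation formulation).
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12) - np.log(p_student + 1e-12)))
    # Standard cross-entropy against the ground-truth class (T = 1).
    ce = -np.log(softmax(student_logits)[true_label] + 1e-12)
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

A student that already matches the teacher's logits incurs no KL penalty, so the loss reduces to the (weighted) cross-entropy term; the teacher's soft probabilities carry inter-class similarity information that the one-hot label alone does not.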
