Abstract

In action recognition, many different network architectures for spatiotemporal feature extraction have been proposed and perform well on several mainstream datasets. This inevitably raises a question: are there transferable characteristics between different models? In this paper, we address this question by introducing a cross-architecture transfer learning scheme, dubbed soft transfer learning, which aims to overcome the divergence between different network structures. A multi-stage semi-supervised training procedure maintains internal consistency between two different models from bottom to top. To this end, we introduce two cross-structure metric strategies that compute the mismatch at different feature levels; together with an entropy classification loss, these are integrated into a three-stage supervision method. We additionally design a new network that learns from supervisor models trained on large-scale datasets. We fine-tune the supervision model and train our new model on the UCF101 and HMDB51 datasets. The experimental results demonstrate the feasibility of the soft transfer method, extend transfer learning in a broader sense, and show the flexibility of deploying existing models. Our method is designed to generalize easily to different networks in other computer vision tasks.
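The three-stage supervision described above could be sketched as a weighted sum of two feature-level mismatch terms and a classification loss. The following is a minimal illustrative sketch in NumPy; the function names, the choice of mean-squared error as the cross-structure metric, and the loss weights are all assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np


def feature_mismatch(f_student, f_teacher):
    """Mismatch between intermediate features of the two architectures.
    Here a plain mean-squared error over flattened features; the paper's
    own cross-structure metrics may differ (assumption)."""
    return float(np.mean((f_student.ravel() - f_teacher.ravel()) ** 2))


def softmax(x):
    # Numerically stable softmax over the last axis.
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


def classification_loss(logits, labels):
    """Standard cross-entropy against ground-truth class labels."""
    p = softmax(logits)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))


def soft_transfer_loss(low_s, low_t, high_s, high_t, logits, labels,
                       w_low=1.0, w_high=1.0, w_cls=1.0):
    """Combine a low-level mismatch, a high-level mismatch, and the
    classification loss into one objective (weights are assumed)."""
    return (w_low * feature_mismatch(low_s, low_t)
            + w_high * feature_mismatch(high_s, high_t)
            + w_cls * classification_loss(logits, labels))
```

In this reading, the student network is trained to match the supervisor's intermediate features at two depths ("from bottom to top") while simultaneously fitting the labels, so the gradient carries both cross-architecture consistency and task supervision.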

