Abstract

Action prediction based on partially observed videos is challenging as the information provided by partial videos is not discriminative enough for classification. In this paper, we propose a Deep Residual Feature Learning (DeepRFL) framework to explore more discriminative information from partial videos, achieving similar representations as those of complete videos. The whole framework performs as a teacher-student network, where the teacher network supports the complete video feature supervision to the student network to capture the salient differences between partial videos and their corresponding complete videos based on the residual feature learning. The teacher and student network are trained simultaneously, and the technique called partial feature detach is employed to prevent the teacher network from disturbing by the student network. We also design a novel weighted loss function to give less penalization to partial videos that have small observation ratios. Extensive evaluations on the challenging UCF101 and HMDB51 datasets demonstrate that the proposed method outperforms state-of-the-art results without knowing the observation ratios of testing videos. The code will be publicly available soon.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call