Abstract
Early action recognition, i.e., recognizing an action before it is fully performed, is a challenging and important task. Existing works mainly focus on deterministic early action recognition, outputting only a single class, and ignore the uncertainty and diversity inherent in this task. Intuitively, when only the early portion of an action is observed, multiple full actions remain possible, since diverse actions can share almost identical early segments in many scenarios. Taking uncertainty and diversity into account and outputting multiple plausible predictions, rather than a single one, is therefore important both for authenticity and for many practical applications. To this end, we propose a novel Diversified Early Action Recognition Network (Dear-Net) that outputs multiple reasonable action classes for each partial sequence by utilizing mode conversion. Specifically, we introduce an effective action diversity learning strategy that drives our network towards predicting diverse and reasonable results, in which each learnable action class is matched with its most suitable mode. Collapsed modes, which fail to receive any action class, are also handled by this strategy to ensure diversity. Moreover, we design a sequence decoder within our network to capture latent global information for better early action recognition; it also provides a feasible scheme for the weakly-supervised setting, in which Dear-Net leverages unlabelled data to improve performance. Experimental results on three challenging datasets clearly show the effectiveness of our approach.
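The abstract does not give the exact training objective, but the core idea of multi-mode prediction with per-class mode matching can be illustrated with a minimal winner-takes-all sketch: each of K prediction modes scores all action classes, and only the mode that best explains the ground-truth label receives the loss (and hence the gradient), which encourages modes to specialize on different plausible continuations. All names below (`best_mode_loss`, the toy logits) are hypothetical illustrations, not the paper's actual formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def best_mode_loss(mode_logits, label):
    """Winner-takes-all loss over K prediction modes (illustrative sketch).

    mode_logits: list of K rows, each a list of C class scores.
    label: ground-truth class index.
    Returns (loss, winning_mode): the cross-entropy of the mode that best
    explains the label. In a real system, modes left unmatched for too long
    ("collapsed" modes) would need extra handling to preserve diversity.
    """
    losses = [-math.log(softmax(row)[label] + 1e-12) for row in mode_logits]
    k = min(range(len(losses)), key=losses.__getitem__)
    return losses[k], k

# Toy example: two modes scoring three action classes for a partial clip.
logits = [[2.0, 0.1, 0.0],   # mode 0 favours class 0
          [0.0, 3.0, 0.0]]   # mode 1 favours class 1
loss, winner = best_mode_loss(logits, label=1)
print(winner)  # mode 1 wins, since it best explains label 1
```

Only the winning mode is penalized, so different modes are free to commit to different full-action hypotheses that share the same early segment.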