Abstract

In order to improve action recognition accuracy, the discriminative kinematic descriptor and deep attention-pooled descriptor are proposed. Firstly, the optical flow field is transformed into a set of kinematic fields with more discriminativeness. Subsequently, two kinematic features are constructed, which more accurately depict the dynamic characteristics of action subject from the multi-order divergence and curl fields. Secondly, by introducing both of the tight-loose constraint and anti-confusion constraint, a discriminative fusion method is proposed, which guarantees better within-class compactness and between-class separability, meanwhile reduces the confusion caused by outliers. Furthermore, a discriminative kinematic descriptor is constructed. Thirdly, a prediction-attentional pooling method is proposed, which accurately focuses its attention on the discriminative local regions. On this basis, a deep attention-pooled descriptor (DKD–DAD) is constructed. Finally, a novel framework with discriminative kinematic descriptor and deep attention-pooled descriptor is presented, which comprehensively obtains the discriminative dynamic and static information in a video. Consequently, accuracies are improved. Experiments on two challenging datasets verify the effectiveness of our methods.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.