Abstract
Video action recognition automatically determines the category of the action performed in a video, and efficient recognition algorithms are needed to predict video labels. This work proposes a video action recognition model based on dual-stream information fusion with attention mechanisms (DSIFAM), which consists of three sub-modules. First, it proposes an improved keyframe extraction method (IKFE): building on K-means clustering, it measures the similarity between video frames with convolutional features rather than raw pixels, and after obtaining preliminary clustering results it performs a secondary optimization to select more representative keyframes. Second, it proposes a video action recognition model based on dual-stream information fusion (DSIF), which introduces ConvLSTM in the spatial stream and replaces the original convolutional network in the temporal stream with P3D, extracting spatial-temporal information more effectively and improving classification performance. Third, it designs a multi-scale attention mechanism (MSAM) that enhances the feature extraction stage and yields higher-quality classification features with stronger representation capability. Finally, systematic experiments on different datasets verify the superiority of DSIFAM for video action recognition.
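The abstract does not give the full IKFE pipeline, but its core idea, clustering convolutional frame features with K-means and keeping one representative frame per cluster, can be sketched as below. This is a minimal illustration, not the paper's implementation: the ResNet-18 backbone and the helper names `extract_features` and `select_keyframes` are assumptions, and the secondary optimization step is omitted.

```python
# Sketch of the IKFE idea: cluster per-frame convolutional features with
# K-means, then keep the frame nearest each centroid as a keyframe.
# The ResNet-18 backbone and all names here are illustrative assumptions,
# not the paper's exact method.
import numpy as np
import torch
from sklearn.cluster import KMeans
from torchvision.models import resnet18

def extract_features(frames: torch.Tensor) -> np.ndarray:
    """Map frames (N, 3, 224, 224) to convolutional features (N, 512)."""
    backbone = resnet18(weights="DEFAULT")
    backbone.fc = torch.nn.Identity()  # drop the classifier head
    backbone.eval()
    with torch.no_grad():
        return backbone(frames).numpy()

def select_keyframes(frames: torch.Tensor, k: int = 8) -> list[int]:
    feats = extract_features(frames)
    km = KMeans(n_clusters=k, n_init=10).fit(feats)
    keyframes = []
    for c in range(k):  # one representative frame per cluster
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        keyframes.append(int(members[dists.argmin()]))
    return sorted(keyframes)
```

Comparing frames in feature space rather than pixel space makes the clustering robust to small camera motion and lighting changes, which is the motivation the abstract gives for replacing pixel-level similarity.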
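For the spatial stream, the abstract only states that ConvLSTM is introduced. As context, a standard ConvLSTM cell replaces the matrix multiplications of an LSTM with convolutions, so hidden states retain their spatial layout while temporal dependencies are modeled. The sketch below is a generic cell under that standard formulation, not the paper's exact configuration.

```python
# A generic ConvLSTM cell: LSTM gating with convolutions instead of
# matrix multiplications, so hidden and cell states stay spatial maps.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hid_ch: int, kernel: int = 3):
        super().__init__()
        # one convolution produces all four gate pre-activations
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel,
                               padding=kernel // 2)

    def forward(self, x, state):
        # x: (batch, in_ch, H, W); state: (h, c), each (batch, hid_ch, H, W)
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)  # update cell state
        h = o * torch.tanh(c)          # emit new hidden state
        return h, c
```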
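In the temporal stream, P3D (pseudo-3D convolution) factorizes a 3x3x3 convolution into a 1x3x3 spatial convolution and a 3x1x1 temporal convolution, which cuts parameters relative to full 3D convolution. A minimal block illustrating that factorization follows; the channel sizes and the serial (P3D-A-style) arrangement are assumptions, since the abstract does not specify which variant is used.

```python
# A P3D-style block: a 3x3x3 convolution factorized into a 1x3x3 spatial
# convolution followed by a 3x1x1 temporal convolution (P3D-A-style
# serial arrangement, assumed here for illustration).
import torch
import torch.nn as nn

class P3DBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1))   # 2D conv over H, W
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0))  # 1D conv over time
        self.bn = nn.BatchNorm3d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, height, width)
        return self.relu(self.bn(self.temporal(self.spatial(x))))

# Usage: two clips of 8 frames at 112x112 resolution
clip = torch.randn(2, 3, 8, 112, 112)
print(P3DBlock(3, 64)(clip).shape)  # torch.Size([2, 64, 8, 112, 112])
```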
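The abstract does not describe how MSAM is constructed, so the following is only one plausible reading of "multi-scale attention": pool the feature map at several spatial scales, score each pooled view with a shared MLP, and reweight the channels with the averaged scores. All names, the choice of scales, and the channel-attention formulation are assumptions for illustration.

```python
# One plausible multi-scale channel attention (an assumption, not the
# paper's MSAM): pool at several scales, score with a shared MLP, and
# reweight channels with the averaged sigmoid scores.
import torch
import torch.nn as nn

class MultiScaleChannelAttention(nn.Module):
    def __init__(self, channels: int, scales=(1, 2, 4), reduction: int = 8):
        super().__init__()
        self.scales = scales
        self.mlp = nn.Sequential(  # shared scoring MLP across scales
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        scores = 0
        for s in self.scales:  # pool the feature map at each scale
            pooled = nn.functional.adaptive_avg_pool2d(x, s)  # (B, C, s, s)
            pooled = pooled.flatten(2).mean(dim=2)            # (B, C)
            scores = scores + self.mlp(pooled)
        weights = torch.sigmoid(scores / len(self.scales))    # (B, C)
        return x * weights[:, :, None, None]  # channel reweighting
```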