Abstract

To address action recognition in short videos and capture the key information they contain, this paper first proposes KGAF-means, a key frame extraction method. KGAF-means is based on the clustering principle and combines the K-means algorithm with the artificial fish swarm algorithm to extract a key frame sequence. From the extracted key frame sequence, features of the RGB images and the optical flow images are extracted separately by an improved dual-stream variable convolution network. The image feature vector and the optical flow feature vector are then fused by cascading, and the fused feature vector is used for action recognition. Experiments on the public Charades dataset show that the method achieves an mAP of 22.9. The results also indicate that the proposed method is more robust than other network models and improves short video action recognition.
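
To make the clustering idea concrete, the following is a minimal sketch and not the authors' KGAF-means: it uses plain K-means only and does not reproduce the artificial fish swarm component or the convolution network. It assumes each frame has already been reduced to a numeric feature vector (for example a colour histogram), picks the frame closest to each cluster centre as a key frame, and shows fusion as simple concatenation of two vectors. All function names (`kmeans`, `select_key_frames`, `fuse`) and parameters are hypothetical and chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iterations=20, seed=0):
    """Plain K-means (no fish swarm step); returns the k centroids."""
    rng = random.Random(seed)
    # Assumption: initial centres are k distinct frames sampled at random.
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            clusters[nearest].append(f)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                dim = len(members[0])
                centroids[c] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


def select_key_frames(features, k, seed=0):
    """Return sorted indices of the frames nearest to each cluster centre."""
    centroids = kmeans(features, k, seed=seed)
    chosen = {
        min(range(len(features)), key=lambda i: squared_distance(features[i], c))
        for c in centroids
    }
    return sorted(chosen)


def fuse(rgb_vector, flow_vector):
    """Fuse two feature vectors by concatenation (one reading of 'cascading')."""
    return list(rgb_vector) + list(flow_vector)


# Toy input: six frames forming two visually distinct groups.
frames = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
]
key_frames = select_key_frames(frames, k=2)
fused = fuse([1.0, 2.0], [3.0])
```

On the toy input, one key frame is selected from each of the two groups, which is the behaviour a clustering-based extractor aims for: redundant frames within a group collapse to a single representative.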
