Abstract

Convolutional neural networks have greatly improved the accuracy of human action recognition. However, as the network deepens, fewer and fewer features are extracted, and in some datasets the size of the target to be recognized varies with the shooting angle. To address these problems, we build on the ResNeXt human action recognition method and propose an improved ResNeXt method based on multi-feature map fusion. First, the video is uniformly sampled to generate training samples, and samples with different frames are used as the input to the network. Second, we add n up-sampling layers after layer 1 of ResNeXt to enlarge the feature maps and extract multiple feature maps, so that the extracted feature maps are clearer and small targets can be better recognized. Finally, the n results obtained are fused using the weighted geometric means combination forecasting method based on the L_1 norm to obtain the final result. In experiments on UCF-101 and HMDB-51, our model reaches an accuracy of 90.3% on UCF-101, which is higher than most state-of-the-art algorithms.
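The uniform sampling step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `uniform_sample_indices`, the choice of taking the middle frame of each segment, and the clip length of 16 in the usage lines are assumptions made for the example.

```python
from typing import List


def uniform_sample_indices(num_frames: int, clip_len: int) -> List[int]:
    """Pick clip_len frame indices spread evenly over a video of num_frames frames.

    Assumption: the video is split into clip_len equal segments and the
    middle frame of each segment is taken. Short videos repeat frames.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    step = num_frames / clip_len  # segment length, possibly fractional
    return [min(num_frames - 1, int(step * i + step / 2)) for i in range(clip_len)]


if __name__ == "__main__":
    # Hypothetical 160-frame video sampled into a 16-frame clip.
    print(uniform_sample_indices(160, 16))
```

Calling the function with several values of `clip_len` would give samples built from different frames of the same video, which is one possible reading of the sampling described above.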

Highlights

  • Due to the potential applications of human action recognition in video surveillance, behavior analysis, video retrieval, and other fields, human action recognition has become a very important field in computer vision research [1]

  • Inspired by FPN [22], we propose a multi-scale fusion method for human action recognition; in addition, after observing the datasets, we found that some actions have complex backgrounds and that the targets to be recognized are small relative to the entire background, so we use up-sampling to enlarge the feature maps and make small targets clearer and easier to detect

  • On the basis of ResNeXt, we propose the Human Action Recognition Algorithm Based on Multi-feature Map Fusion, which adds n up-sampling layers after layer1 and trains them separately, aiming to enlarge the feature maps and make the extracted features clearer
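To illustrate what enlarging a feature map means, the sketch below up-samples a small 2-D map with nearest-neighbor repetition and stacks n such steps. The highlights do not state the interpolation mode, the scale factor, or how the n layers are connected, so nearest-neighbor interpolation, a factor of 2, sequential stacking, and the names `upsample_nearest` and `multi_scale_maps` are all assumptions.

```python
from typing import List

FeatureMap = List[List[float]]


def upsample_nearest(feature_map: FeatureMap, scale: int = 2) -> FeatureMap:
    """Enlarge a 2-D feature map by repeating each value scale x scale times."""
    if scale < 1:
        raise ValueError("scale must be at least 1")
    enlarged = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(scale)]
        # Repeat the widened row so that height grows by the same factor.
        enlarged.extend(list(wide_row) for _ in range(scale))
    return enlarged


def multi_scale_maps(feature_map: FeatureMap, n: int) -> List[FeatureMap]:
    """Return n progressively enlarged maps (assumed: each layer doubles the size)."""
    maps = []
    current = feature_map
    for _ in range(n):
        current = upsample_nearest(current, 2)
        maps.append(current)
    return maps


if __name__ == "__main__":
    for m in multi_scale_maps([[1.0, 2.0], [3.0, 4.0]], n=2):
        print(len(m), "x", len(m[0]))
```

In a real network the same operation would be applied per channel to the output of layer1, and each enlarged map would feed its own branch.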


Summary

INTRODUCTION

Due to its potential applications in video surveillance, behavior analysis, video retrieval, and other fields, human action recognition has become a very important field in computer vision research [1]. On the basis of ResNeXt, we propose the Human Action Recognition Algorithm Based on Multi-feature Map Fusion, which adds n up-sampling layers after layer1 and trains them separately, aiming to enlarge the feature maps and make the extracted features clearer. Because many features are filtered out after convolution, we build this new architecture on ResNeXt-101. It has two parts: (1) several up-sampling layers [25] extract more feature maps; (2) the several groups of results obtained are fused using the weighted geometric means combination forecasting method based on the L_1 norm to get the final result.
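The fusion step can be sketched as a weighted geometric mean of the n branch outputs, assuming each output is a class-probability vector and the weights are already known. The summary says the weights come from a combination forecasting method based on the L_1 norm; that fitting step is not reproduced here. The function name, the floor `eps`, the final renormalization, and the example scores are assumptions.

```python
import math
from typing import List, Sequence


def fuse_weighted_geometric_mean(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-12,
) -> List[float]:
    """Fuse per-branch class scores as prod_i scores[i][c] ** weights[i].

    Assumptions: weights are non-negative and sum to 1, and the fused
    scores are renormalized so that they sum to 1 again.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("need one weight per branch")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    num_classes = len(scores[0])
    fused = []
    for c in range(num_classes):
        # Work in log space; eps guards against log(0).
        log_sum = sum(
            w * math.log(max(branch[c], eps)) for branch, w in zip(scores, weights)
        )
        fused.append(math.exp(log_sum))
    total = sum(fused)
    return [f / total for f in fused]


if __name__ == "__main__":
    # Two hypothetical branches voting over two classes.
    print(fuse_weighted_geometric_mean([[0.6, 0.4], [0.9, 0.1]], [0.5, 0.5]))
```

Compared with an arithmetic average, the geometric mean penalizes a class that any single branch scores close to zero, so the fused prediction favors classes on which the branches agree.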

