Abstract

Video-based human motion recognition has attracted wide attention, and its results are used in intelligent human-computer interaction, virtual reality, intelligent monitoring, security, multimedia content analysis, and other areas. This study explores human action recognition in football scenes by combining multimodal features with learning quality. BN-Inception, pretrained on the ImageNet dataset, is selected as the underlying feature extraction network, and the experiments use UCF101 and HMDB51, two datasets captured in uncontrolled, real-world environments. The spatial deep convolutional network takes image frames as input, and the temporal deep convolutional network takes stacked optical flow as input, to perform multimodal human action recognition. With multimodal feature fusion, accuracy on UCF101 is generally high, exceeding 80% in all cases and reaching 95.2% at best, while accuracy on HMDB51 is about 70%, with a lowest value of 56.3%. The results indicate that the proposed method is more accurate and more effective when multimodal features are used, and that single-mode feature recognition is significantly less accurate than multimodal feature recognition. The study thus provides an effective multimodal approach to human motion recognition in football and other sports scenes.
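
The two-stream setup described in the abstract can be illustrated with a small late-fusion sketch. The abstract does not state which fusion rule the study uses, so the code below assumes a weighted average of per-class softmax scores, a common choice for two-stream networks. The function names and the `temporal_weight` value are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_logits, temporal_logits, temporal_weight=0.6):
    """Weighted average of class probabilities from the two streams.

    spatial_logits: scores from the image-frame (spatial) network.
    temporal_logits: scores from the stacked-optical-flow (temporal) network.
    temporal_weight: hypothetical weight, an assumption rather than a
    value reported in the paper.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    w = temporal_weight
    return [(1.0 - w) * s + w * t for s, t in zip(p_spatial, p_temporal)]


def predict(fused_scores):
    """Return the index of the highest-scoring class."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)


if __name__ == "__main__":
    # Toy scores for three action classes (illustrative values only).
    fused = fuse_streams([2.0, 1.0, 0.0], [0.0, 3.0, 0.0])
    print("fused scores:", fused)
    print("predicted class index:", predict(fused))
```

Setting `temporal_weight` to 0 or 1 reduces the sketch to a single-stream classifier, which gives a simple way to compare single-mode and multimodal predictions on the same scores.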

Highlights

  • In addition to the detection, recognition, and tracking of moving objects, action analysis and understanding are included in the action recognition of people in the football scene. This action analysis captures person-to-person and object-to-person interactions in the football scene

  • Liu et al. regarded human behavior recognition as an active research field in computer vision and machine learning. They proposed a large number of algorithms, most of which are designed for subsets of four learning problems

  • Sivarathinabala et al. believed that multimodal biometrics improve security by protecting the system from spoofing attacks. The system uses face and gait biometrics for authentication and recognition. The input videos are taken from two surveillance cameras, located in the frontoparallel and frontonormal views, respectively. The gait system uses video from the frontoparallel view and uses the modeless method to extract a temporal and spatial motion summary of the gait cycle. They compared gait characteristics by calculating the Euclidean distance between them. The face system uses video from the frontonormal view and uses an appearance-based method to extract features from the user's face. They compared facial features by calculating the chi-squared dissimilarity between them
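
The two comparison measures named in the last highlight can be sketched as follows. The page does not give the formulas used by Sivarathinabala et al., so the chi-squared form below (the sum of squared differences divided by the sum of the two bins) is one common definition and is an assumption here. The function names and the `eps` parameter are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chi_squared_dissimilarity(h1, h2, eps=1e-12):
    """Chi-squared dissimilarity between two non-negative histograms.

    Uses sum((x - y)^2 / (x + y)); bins whose combined mass is at most
    eps are skipped to avoid division by zero. Other variants (for
    example with a factor of 0.5) also appear in the literature.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum((x - y) ** 2 / (x + y) for x, y in zip(h1, h2) if x + y > eps)


if __name__ == "__main__":
    # Illustrative feature vectors, not data from the cited work.
    print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
    print(chi_squared_dissimilarity([0.5, 0.5], [0.25, 0.75]))
```

In both cases a smaller value means the two feature vectors are more alike, so a match can be declared by comparing the value with a threshold chosen on validation data.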

Summary

Introduction

In addition to the detection, recognition, and tracking of moving objects, action analysis and understanding are included in the action recognition of people in the football scene. This action analysis captures person-to-person and object-to-person interactions in the football scene. Liu et al. proposed a large number of algorithms, most of which are designed for subsets of four learning problems. In these problems, the comparison between algorithms may be further limited by the variance within the dataset, the experimental configuration, and other factors. Their research introduces a new multimode, multiview, interactive dataset to evaluate human behavior recognition methods in four scenarios. For continuous facial expression recognition, they designed two spatiotemporal dense scale-invariant feature transform (SIFT) features and combined them with multimodal features to recognize expressions from image sequences. For static facial expression recognition based on video frames, they extracted dense SIFT features and some deep convolutional neural network features, including their own CNN structure. This study first introduces the characteristics of the football scene and the analysis of learning quality, and then summarizes the methods and classification of human recognition in detail. It can be concluded that the method of this study achieves higher accuracy and is more effective in multimodal feature acquisition.

Multimodal Feature Learning Quality and Human Motion Recognition
Human Behavior Recognition Algorithm
Human Action Recognition Experiment with Multimodal Features
Findings
Parameter Setting