Abstract

Action recognition is an important research direction in computer vision with wide applications, such as video surveillance and human-robot interaction. Because of complex backgrounds and changes in viewing angle, accurately recognizing and analyzing human motion in real-life scenarios remains a challenging problem. To improve the accuracy of pedestrian detection and motion recognition, this paper proposes a novel edge-aware end-to-end deep network. It uses an edge-aware pooling module to improve the accuracy of pedestrian contours, and a multi-scale pyramid pooling layer to capture spatio-temporal context features from video sequences. The complementary edge-related features help preserve clear boundaries, and combining the auxiliary side output with the pyramid pooling layer output extracts rich global context information. Extensive qualitative and quantitative experiments on the UCF-101, HMDB-51, and KTH datasets show that the proposed model improves the performance of existing pedestrian detection and motion recognition networks.

Highlights

  • Digital twin and virtual/augmented reality technologies have developed rapidly in recent years, promoting innovation in many traditional industries, such as manufacturing, construction, and education

  • To evaluate the performance of the pedestrian-aware dense network pedestrian recognition algorithm proposed in this paper, the training set uses widely accepted motion recognition datasets: KTH, UCF101 and HMDB51. The KTH database includes 6 types of motions performed by 25 different pedestrians in four scenarios

  • The pyramid pooling model is applied to the decoder part, and the edge-aware pooling module is integrated to generate the final motion recognition network
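
The idea behind multi-scale pyramid pooling can be sketched as follows. This is a minimal, dependency-free illustration and not the paper's module: the function names `adaptive_avg_pool` and `pyramid_pool`, the bin sizes `(1, 2, 4)`, the use of average pooling, and the single-channel 2D input are all assumptions made for the example. The summary does not specify how the pooled outputs are combined with the edge-aware pooling module, so that step is left out.

```python
def adaptive_avg_pool(feature_map, bins):
    """Average-pool a 2D feature map (a list of rows) into a bins x bins grid."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Row range of the i-th bin: floor for the start, ceil for the end,
        # so bins cover the whole map (they may overlap if sizes do not divide).
        r0 = (i * height) // bins
        r1 = ((i + 1) * height + bins - 1) // bins
        row = []
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = ((j + 1) * width + bins - 1) // bins
            cells = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature_map, bin_sizes=(1, 2, 4)):
    """Concatenate pooled grids from several scales into one flat descriptor."""
    descriptor = []
    for bins in bin_sizes:  # coarse (global context) to fine (local detail)
        for row in adaptive_avg_pool(feature_map, bins):
            descriptor.extend(row)
    return descriptor
```

The coarsest level (one bin) summarizes the whole map, while finer levels keep more spatial detail. Concatenating the levels gives a fixed-length descriptor whose size depends only on the bin sizes, not on the input resolution.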

Summary

INTRODUCTION

Digital twin and virtual/augmented reality technologies have developed rapidly in recent years, promoting innovation in many traditional industries, such as manufacturing, construction, and education. Vision-based motion recognition technology still faces many problems, mainly due to the non-rigid nature of the human body and the influence of complex backgrounds [4]. The dimensionality reduction coefficients are taken as the features for motion analysis and evaluation, and an SVM classifier is used to classify and recognize the model’s walk-show actions. The accuracy of this method under cross-validation is 71.9070, and it provides a preliminary professional evaluation of the model walk-show. Existing deep learning network structures use only high-level features for image classification and recognition, which makes it difficult to distinguish targets that can only be recognized from fine features, such as gesture categories and vehicle models. To enhance the generalization ability of the model, this paper proposes a loss function based on regularization constraints.
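
A loss with a regularization constraint can be illustrated with a small sketch. The summary does not give the paper's actual loss, so everything here is an assumption for illustration: a softmax cross-entropy data term, an L2 penalty on the weights, and the hypothetical names `regularized_loss` and `weight_decay`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def regularized_loss(logits, target, weights, weight_decay=1e-4):
    """Cross-entropy for one sample plus an L2 penalty on the weights.

    The L2 term is an assumed stand-in for the paper's regularization
    constraint, which the summary does not specify.
    """
    probs = softmax(logits)
    data_term = -math.log(probs[target])  # penalizes a low score for the true class
    penalty = weight_decay * sum(w * w for w in weights)  # discourages large weights
    return data_term + penalty
```

The penalty term trades a small increase in training loss for smaller weights, which is the usual motivation for regularization when the goal is better generalization.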

OUR IMPROVED MOTION RECOGNITION ALGORITHM
EDGE-AWARE POOLING MODULE
MULTI-SCALE PYRAMID SUPERVISION MODULE
CONCLUSION