Abstract

As virtual reality technology develops, the analysis and processing of video content have become hot topics in computer vision. Video action detection aims to localize action instances in online video, and its research spans many fields, such as computer vision and spatio-temporal prediction. To address the low efficiency of classification models and the inaccurate localization of small-scale targets in complex scenes, we propose a novel method for generating candidate intervals for action detection. An action recognition model is adopted to generate an action score sequence over the video timeline. We also propose an uncertainty model for the pose detection algorithm, which processes candidates in descending score order. In the proposal stage, a candidate list is generated by concatenating video segments that contain the same pose, so that action instances with differing poses or non-maximal durations can still be detected. Experiments with traditional target detection and multiple deep learning models show that the proposed Non-Maximum Suppression algorithm has a strong ability to extract neural network features. Furthermore, compared with the traditional ATSS and Faster R-CNN methods, detection quality and performance improve by more than 15.2% and 7.8%, respectively. Our method can fully exploit perception information to improve the quality of decision planning and serves as a bridge between perception fusion and decision planning.
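The abstract does not spell out how non-maximal candidate intervals are suppressed. As an illustration only (the function and parameter names below are assumptions, not taken from the paper), a standard temporal Non-Maximum Suppression over scored candidate intervals can be sketched as:

```python
def temporal_iou(a, b):
    """Temporal IoU between two intervals given as (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def temporal_nms(proposals, iou_thresh=0.5):
    """Greedy NMS over candidate intervals.

    proposals: list of (start, end, score) tuples.
    Candidates are visited in descending score order; a candidate is
    kept only if it does not overlap any already-kept interval by
    more than iou_thresh.
    """
    kept = []
    for p in sorted(proposals, key=lambda x: -x[2]):
        if all(temporal_iou(p[:2], k[:2]) < iou_thresh for k in kept):
            kept.append(p)
    return kept
```

For example, given two heavily overlapping proposals for the same action and one disjoint proposal, only the higher-scored overlapping interval and the disjoint one survive. The actual method in the paper may use a different overlap measure or a soft suppression scheme; this sketch shows only the generic greedy variant.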
