Abstract

Target behavior recognition is an important task in the analysis and processing of massive volumes of surveillance video. Most researchers focus either on making the convolution operators in intelligent recognition systems lightweight or on increasing the complexity of lightweight modules, while little work addresses lightweight designs for the pointwise (point-by-point) convolution modules, which account for a large share of parameters and computation. This article therefore studies intelligent recognition of target behavior in key frames of surveillance video, based on a lightweight convolutional neural network. The three-dimensional position information of skeletal joints is extracted as the target behavior feature. A local vector aggregation descriptor is used to represent the key frames of the surveillance video more compactly, and the process for generating this descriptor is given. After structured pruning of the filters, the memory footprint of the network model is significantly reduced, making the model lightweight. Experimental results verify the effectiveness of the model.
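To illustrate why pointwise convolution tends to dominate the cost of a lightweight block, the following minimal sketch counts weights in a depthwise separable convolution. It assumes a 3×3 depthwise kernel followed by a 1×1 pointwise convolution and ignores biases. The channel widths and the function names are hypothetical choices for illustration and are not taken from the article.

```python
def depthwise_separable_params(c_in, c_out, k=3):
    """Return (depthwise, pointwise) weight counts, ignoring biases.

    Assumed layout (not from the article): a k x k depthwise convolution
    followed by a 1 x 1 pointwise convolution.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # one 1 x 1 x c_in filter per output channel
    return depthwise, pointwise


def pointwise_share(c_in, c_out, k=3):
    """Fraction of the block's weights that sit in the pointwise layer."""
    dw, pw = depthwise_separable_params(c_in, c_out, k)
    return pw / (dw + pw)


if __name__ == "__main__":
    # Hypothetical channel widths, chosen only for illustration.
    dw, pw = depthwise_separable_params(256, 256)
    print(dw, pw, round(pointwise_share(256, 256), 3))  # 2304 65536 0.966
```

Under these assumptions, removing half of the output filters (`depthwise_separable_params(256, 128)`) halves the pointwise weight count from 65536 to 32768, which is the kind of saving structured filter pruning aims for.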
