Abstract

Video object recognition for UAV ground detection is widely used in target search, daily patrol, environmental reconnaissance, and other fields. We propose a novel parallel deep learning network with joint global and local feature extraction for UAV video target detection. This paper focuses on the feature extraction and target-background discrimination that target discovery requires. We address the key problems of real-time target recognition, namely multiscale targets, high background complexity, many small targets, dense target arrangement, and multiple target orientations, and put forward an optimized network scheme. To handle multiscale targets and large changes of target scale within an image, the network matches targets against detection boxes of different sizes and aspect ratios so that each target is paired with the closest one, and then the position of the detection box is fine-tuned by regression. For the viewing-angle problem specific to airborne downward-looking images, where detection must be invariant to target rotation, the usual solution is data augmentation: by applying rotation transformations to the training data, the neural network can learn the rotation invariance of the target. For multidirectional targets with large aspect ratios, large tilt angles, and variable orientation, we propose using a tilted detection frame instead of the ordinary rectangular detection frame. For images containing a large number of densely arranged targets, we propose a feature refining module, which effectively improves the detector's performance on densely arranged targets.
The experimental results show that the proposed algorithm improves target detection accuracy by more than 10% under focal length changes of 1 to 10 times. The detection accuracy meets the requirements of practical application.
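
Two of the ideas above can be illustrated with a minimal Python sketch: matching a target to the closest of several candidate boxes with different sizes and aspect ratios, and describing a tilted detection frame by its center, size, and angle. This is not the authors' implementation. The function names, the scale and ratio values, the use of intersection over union (IoU) as the "closest match" criterion, and the (cx, cy, w, h, angle) parameterization are all assumptions made for illustration.

```python
from itertools import product
from math import cos, radians, sin, sqrt


def make_candidates(cx, cy, scales, ratios):
    """Axis-aligned (x1, y1, x2, y2) boxes centred at (cx, cy), one per scale/ratio pair."""
    boxes = []
    for s, r in product(scales, ratios):
        w, h = s * sqrt(r), s / sqrt(r)  # r = width / height; area stays s * s
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_match(target, candidates):
    """Index and IoU of the candidate overlapping the target most (assumed criterion)."""
    scores = [iou(target, c) for c in candidates]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Corner points of a tilted box given by centre, size, and rotation angle (assumed form)."""
    c, s = cos(radians(angle_deg)), sin(radians(angle_deg))
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def enclosing_box(corners):
    """Smallest upright (x1, y1, x2, y2) box that contains all corner points."""
    xs, ys = zip(*corners)
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Hypothetical scales and ratios; a wide 60 x 20 target picks a wide candidate.
    candidates = make_candidates(50, 50, scales=[16, 32, 64], ratios=[0.5, 1.0, 2.0])
    idx, score = best_match((20, 40, 80, 60), candidates)
    print("best candidate index:", idx, "IoU: %.2f" % score)

    # A 40 x 10 target tilted by 45 degrees, and the upright box needed to cover it.
    box = enclosing_box(rotated_box_corners(50, 50, 40, 10, 45))
    print("upright box area: %.1f" % ((box[2] - box[0]) * (box[3] - box[1])))
```

In this sketch, a 40 x 10 target tilted by 45 degrees covers 400 square pixels, while the upright box that encloses it covers about 1250, so most of the upright box is background. This is one way to see why a tilted frame may suit elongated, rotated targets better than an ordinary rectangular frame.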
