Abstract
Non-maximum suppression (NMS) is a commonly used post-processing step in object detection. However, because its constraint condition is simple, the NMS algorithm cannot effectively eliminate missed and false detections. To address the poor performance of traditional NMS in scenes with dense, highly overlapping objects, we design an RGB-D object detection network based on the YOLO v3 framework that applies level-by-level metaphase fusion to the RGB and depth information, and we propose an improved NMS algorithm that fuses depth characteristics. The depth of the object inside each detection box is used to decide whether highly overlapping detection boxes contain the same object or different objects. The average depth of the pixels inside each detection box is calculated as a penalty term, and this penalty term is added to the detection box score to obtain a new constraint condition for non-maximum suppression. Experimental results on the NYU Depth V2 dataset show that the mean average precision (mAP) of the proposed Depth Fusion NMS algorithm is 0.8%, 0.5% and 0.3% higher than that of the Greedy-NMS, Soft NMS-L and Soft NMS-G methods, respectively. Comparison and analysis show that our method not only detects more overlapping objects but also achieves better object localization accuracy.
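For orientation, the three baselines named above differ only in how they rescale the score of a box that overlaps an already selected box. The sketch below is a minimal, illustrative Python version of those rescoring rules, following the usual published definitions of greedy NMS and linear and Gaussian Soft-NMS. The function names and the default values for `iou_thresh`, `sigma` and `score_thresh` are assumptions chosen for the example, not settings taken from this paper.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def decay(overlap, method, iou_thresh=0.5, sigma=0.5):
    """Factor applied to a neighbour's score, given its IoU with the selected box."""
    if method == "greedy":      # hard suppression above the threshold
        return 0.0 if overlap > iou_thresh else 1.0
    if method == "linear":      # Soft-NMS, linear decay above the threshold
        return 1.0 - overlap if overlap > iou_thresh else 1.0
    if method == "gaussian":    # Soft-NMS, Gaussian decay for every overlap
        return math.exp(-(overlap ** 2) / sigma)
    raise ValueError(f"unknown method: {method}")

def nms(boxes, scores, method="greedy", iou_thresh=0.5, sigma=0.5, score_thresh=0.001):
    """Return (indices of kept boxes in selection order, rescored copy of scores)."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = [i for i in range(len(boxes)) if scores[i] >= score_thresh]
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append(best)
        for i in remaining:
            scores[i] *= decay(iou(boxes[best], boxes[i]), method, iou_thresh, sigma)
        # Boxes whose score has decayed below the threshold are dropped.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return keep, scores

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
keep_greedy, _ = nms(boxes, scores, "greedy")
keep_linear, scores_linear = nms(boxes, scores, "linear")
keep_gauss, scores_gauss = nms(boxes, scores, "gaussian")
```

With these toy boxes, greedy NMS discards the second box outright, whereas both Soft-NMS variants keep it with a reduced score. None of the three rules uses depth, which is the gap the proposed method targets.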
Highlights
Object detection is an important research direction in the field of computer vision
Object detection algorithms based on convolutional neural networks can be divided into three steps [2]: feature learning and object extraction, object classification and location regression, and non-maximum suppression algorithms to select the optimal detection boxes
This paper aims to improve the non-maximum suppression (NMS) algorithm in a double-channel RGB-D convolutional neural network by using object depth characteristics, effectively reducing the localization error of the detection boxes and decreasing the missed detection rate for dense, highly overlapping objects, thereby improving the accuracy of the detection model
Summary
Object detection is an important research direction in the field of computer vision. It can be understood as a visual algorithm that gives the computer a human-like visual recognition ability: from an image obtained by a sensor, it identifies the object categories and obtains the locations of the objects in the scene. Object detection algorithms based on convolutional neural networks can be divided into three steps [2]: feature learning and object extraction, object classification and location regression, and non-maximum suppression to select the optimal detection boxes. To address the shortcomings of traditional NMS, this paper improves the NMS algorithm for RGB-D object detection: it adjusts the detection box score by using the depth characteristics of different objects and obtains the optimal detection box for each object, thereby effectively reducing the false and missed detection rates of the detection model. We applied the improved NMS algorithm in the popular detection framework YOLO v3 [10], and trained and tested the network model on the challenging RGB-D dataset NYU Depth V2 [11], obtaining a high mean average precision (mAP).
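The idea of using depth to decide whether two overlapping boxes cover the same object can be illustrated with a short Python sketch. The exact penalty term is not given in this text, so the rule below is a deliberately simplified assumption rather than the paper's formula: a lower-scoring box is suppressed only when it both overlaps a kept box and has a similar mean depth. The names `mean_depth`, `depth_aware_nms` and `depth_tol`, the threshold values, and the convention that a depth of zero marks a missing reading are all hypothetical choices made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def mean_depth(depth, box):
    """Average depth of the pixels inside a box; depth is a 2D list indexed [y][x]."""
    x1, y1, x2, y2 = (int(v) for v in box)
    # Assumption: a depth of 0 marks a missing sensor reading and is skipped.
    vals = [depth[y][x] for y in range(y1, y2) for x in range(x1, x2) if depth[y][x] > 0]
    return sum(vals) / len(vals) if vals else 0.0

def depth_aware_nms(boxes, scores, depth, iou_thresh=0.5, depth_tol=0.3):
    """Greedy NMS with a hypothetical depth gate (not the paper's exact rule)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    depths = [mean_depth(depth, b) for b in boxes]
    keep = []
    for i in order:
        same_object = any(
            iou(boxes[i], boxes[k]) > iou_thresh and abs(depths[i] - depths[k]) <= depth_tol
            for k in keep
        )
        # Overlapping boxes at clearly different depths are treated as different objects.
        if not same_object:
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11)]
scores = [0.9, 0.8]

# Flat scene: both boxes see the same depth, so the second is suppressed.
flat = [[2.0] * 12 for _ in range(12)]
keep_flat = depth_aware_nms(boxes, scores, flat)

# Nearer surface along the top row and left column changes the first box's mean depth.
layered = [[1.0 if (x == 0 or y == 0) else 3.0 for x in range(12)] for y in range(12)]
keep_layered = depth_aware_nms(boxes, scores, layered)
```

In the flat scene only the higher-scoring box survives, while in the layered scene both boxes are kept because their mean depths differ by more than `depth_tol`. The method described here adds a depth-based penalty to the box score rather than applying a hard gate, so this sketch only conveys the intuition.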