Abstract
The ‘You Only Look Once’ v3 (YOLOv3) method is among the most widely used deep learning-based object detection methods. It uses the k-means clustering method to estimate the initial width and height of the predicted bounding boxes. With this method, the estimated width and height are sensitive to the initial cluster centers, and processing large-scale datasets is time-consuming. To address these problems, a new clustering method for estimating the initial width and height of the predicted bounding boxes has been developed. First, it randomly selects a pair of width and height values as one initial cluster center, separate from the width and height of the ground truth boxes. Second, it constructs Markov chains based on the selected initial cluster center and uses the final point of each Markov chain as one of the other initial centers. In constructing the Markov chains, intersection-over-union (IoU), instead of the square root method, is used to compute the distance between the selected initial cluster centers and each candidate point. Finally, the method continually updates the cluster centers with each new set of width and height values, each of which is only a part of the data selected from the dataset. Our simulation results show that the new method converges faster when initializing the width and height of the predicted bounding boxes and that it selects more representative initial widths and heights. Our proposed method achieves better performance than the YOLOv3 method in terms of recall, mean average precision, and F1-score.
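The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the proposal distribution, the acceptance rule, or the update formula, so the sketch assumes a uniform proposal, a Metropolis-Hastings-style acceptance ratio of IoU distances, and a running-mean mini-batch update. It also assumes the first center is drawn at random from the ground truth boxes. All function names and parameters (`iou_wh`, `markov_chain_seed`, `mini_batch_update`, `chain_length`) are hypothetical.

```python
import random


def iou_wh(box, center):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], center[0]) * min(box[1], center[1])
    union = box[0] * box[1] + center[0] * center[1] - inter
    return inter / union


def iou_distance(box, center):
    """Distance used for clustering: 1 - IoU, so identical shapes have distance 0."""
    return 1.0 - iou_wh(box, center)


def min_distance(box, centers):
    """Distance from a box to its closest already-selected center."""
    return min(iou_distance(box, c) for c in centers)


def markov_chain_seed(boxes, k, chain_length, rng):
    """Select k initial centers; each center after the first is the final
    state of a short Markov chain (assumed acceptance rule, see lead-in)."""
    centers = [rng.choice(boxes)]  # assumption: first center drawn from the data
    while len(centers) < k:
        current = rng.choice(boxes)
        current_d = min_distance(current, centers)
        for _ in range(chain_length - 1):
            candidate = rng.choice(boxes)  # assumption: uniform proposal
            candidate_d = min_distance(candidate, centers)
            # Accept with probability min(1, candidate_d / current_d), so points
            # far from the existing centers are favoured.
            if current_d == 0 or candidate_d / current_d > rng.random():
                current, current_d = candidate, candidate_d
        centers.append(current)  # final state of the chain
    return centers


def mini_batch_update(centers, counts, batch):
    """Move each center toward the boxes of one mini-batch (running mean).
    Modifies centers and counts in place."""
    for box in batch:
        j = min(range(len(centers)), key=lambda i: iou_distance(box, centers[i]))
        counts[j] += 1
        rate = 1.0 / counts[j]  # per-center learning rate shrinks over time
        centers[j] = tuple((1 - rate) * c + rate * b
                           for c, b in zip(centers[j], box))
```

A typical use would call `markov_chain_seed` once on the ground truth (width, height) pairs, initialise `counts` to a list of ones, and then call `mini_batch_update` repeatedly with randomly sampled subsets of the boxes, so that no pass needs the full dataset.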
Highlights
Object detection is an important and challenging field in computer vision, one which has been the subject of extensive research [1]
The goal of object detection is to detect all objects and classify them. It has been widely used in autonomous driving [2], pedestrian detection [3], medical imaging [4], industrial detection [5], robot vision [6], intelligent video surveillance [7], remote sensing images [8], etc.
The complexity of the k-means clustering method is O(nkd) for d-dimensional data with k cluster centers, where n is the number of data points
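To make the O(nkd) term concrete, the sketch below counts the coordinate-level operations in a single k-means assignment pass, which is how this bound is usually read (one pass over the data, not the whole run). The function name `assign` and the operation counter are illustrative assumptions, and squared Euclidean distance is used only to keep the counting simple.

```python
def assign(points, centers):
    """One k-means assignment pass.

    Returns (labels, ops), where ops is the number of per-coordinate
    distance operations performed: exactly n * k * d.
    """
    ops = 0
    labels = []
    for p in points:                      # n data points
        best, best_d2 = 0, float("inf")
        for j, c in enumerate(centers):   # k cluster centers
            d2 = 0.0
            for a, b in zip(p, c):        # d dimensions
                d2 += (a - b) ** 2
                ops += 1
            if d2 < best_d2:
                best, best_d2 = j, d2
        labels.append(best)
    return labels, ops
```

For bounding-box clustering d is 2 (width and height), so the cost of a pass is driven by the number of ground truth boxes n and the number of anchors k.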
Summary
Object detection is an important and challenging field in computer vision, one which has been the subject of extensive research [1]. The goal of object detection is to detect all objects and classify them. It has been widely used in autonomous driving [2], pedestrian detection [3], medical imaging [4], industrial detection [5], robot vision [6], intelligent video surveillance [7], remote sensing images [8], etc. Compared with traditional detection algorithms, deep learning-based object detection methods perform better in terms of robustness, accuracy, and speed for multi-classification tasks.