Abstract

Object detection performance is directly related to the apparent size of the object to be detected, so most state-of-the-art algorithms dedicate a separate detection head to each object size. In this work, we propose an end-to-end pipeline to adapt a single-shot object detector (SSD) to the underlying object size distribution of the target detection domain. Our contributions are the adjustments to the detector architecture and the introduction of a novel batch sampling method. To validate the effect of our method, we chose a task-specific, highly specialized object detection and classification dataset of tomato fruits that, apart from bounding box information, also contains class information for three ripening stages of each tomato fruit. More specifically, the major motivation and contributions are discussed in relation to the recent bibliography. Next, an extensive analysis of our pipeline is presented, where the concept of scale alignment is explained in detail along with the novel sampling method. Following the results of a series of experiments, we conclude that our pipeline significantly improves over the “off-the-shelf” base single-shot detector and that its detection performance is comparable to more elaborate algorithms, especially when detection performance is measured with less emphasis on box localization. Lastly, we include a stratified ablation study in the closing sections, where we measure the impact of each step along our proposed SSD adaptation pipeline.

Highlights

  • Object Detection (OD) is one of the most challenging problems in computer vision, aiming to determine the location of certain objects on images and videos, as well as to classify them among specific classes

  • Our motivation lies in the need for a higher-level, universal approach that could help a handful of target detection model architectures significantly improve their detection performance

  • This work introduces two major contributions: first, we present a straightforward pipeline to configure an object detection model so that it performs optimally in the size range of a given target object; second, we introduce a novel training strategy that significantly improves the detection performance of a model while avoiding the complexity of similar methods

Summary

Introduction

Object Detection (OD) is one of the most challenging problems in computer vision, aiming to determine the location of certain objects in images and videos and to classify them into specific classes. OD models are tied to the standard datasets used for pretraining, such as COCO (Lin et al, 2014) or OpenImages (Kuznetsova et al, 2020), which affects their response to object sizes different from those in the dataset. As discussed in the following sub-section, existing methods are based on a variety of approaches that tackle the problem of multi-scale object detection with remarkable efficiency. Most of these approaches are either new architectures altogether or heavily tied to a specific type of detection model.

