Abstract
Addressing the challenge of efficiently detecting objects in ultra-high-resolution images during object detection tasks, this paper proposes a novel method called MegaDetectNet, which leverages foreground image for large-scale resolution image object detection. MegaDetectNet utilizes a foreground extraction network to generate a foreground image that highlights target regions, thus avoiding the computationally intensive process of dividing the image into multiple sub-images for detection, and significantly improving the efficiency of object detection. The foreground extraction network in MegaDetectNet is built upon the YOLOv5 model with modifications: the large object detection head and classifier are removed, and the PConv convolution is introduced to reconstruct the C3 module, thereby accelerating the convolution process and enhancing foreground extraction efficiency. Furthermore, a Res2Rep convolutional structure is developed to enlarge the receptive field and improve the accuracy of foreground extraction. Finally, a foreground image construction method is proposed, fusing and stitching foreground target regions into a unified foreground image. This approach replaces multiple divided sub-images with a single foreground image for detection, reducing overhead time. The proposed MegaDetectNet method’s effectiveness for detecting objects in ultra-high-resolution images is validated using the publicly available DOTA dataset. Experimental results demonstrate that MegaDetectNet achieves an average time reduction of 83.8% compared to the sub-image division method among various commonly used object detectors, with only a marginal 8.7% decrease in mAP (mean Average Precision). This validates the practicality and efficacy of the MegaDetectNet method for object detection in ultra-high-resolution images.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.