Abstract

Remote sensing targets vary widely in size, are often densely distributed, and appear against complex backgrounds, all of which make remote sensing target detection difficult. To detect remote sensing targets at different scales, we propose a new model based on You Only Look Once (YOLO)-V3, a recent version of YOLO. Because the original YOLO-V3 performs poorly on remote sensing targets, we adopted DenseNet (Densely Connected Network) to enhance its feature extraction capability. Moreover, we increased the number of detection scales from the three used in the original YOLO-V3 to four. Experiments on the RSOD (Remote Sensing Object Detection) dataset and the UCS-AOD (Dataset of Object Detection in Aerial Images) dataset showed that our approach was more accurate than Faster-RCNN, SSD (Single Shot Multibox Detector), YOLO-V3, and YOLO-V3 tiny. Compared with the original YOLO-V3, the mAP (mean Average Precision) of our approach increased from 77.10% to 88.73% on the RSOD dataset. In particular, the mAP for targets such as aircraft, which are mainly small targets, increased by 12.12%. In addition, the detection speed was not significantly reduced. Overall, our approach achieved higher accuracy while still accommodating real-time performance for remote sensing target detection.

Highlights

  • Remote sensing images [1,2,3,4] have attracted growing research interest in the field of computer vision (CV) with the rapid development of satellite and imaging technology

  • Compared with the previous two classes of algorithms, remote sensing target detection algorithms based on deep learning have better accuracy and robustness because they no longer rely on hand-designed features

  • To verify the validity of our improved You Only Look Once (YOLO)-V3 for remote sensing target detection, we compared our approach with the original YOLO-V3, YOLO-V3 tiny, and other state-of-the-art algorithms on the Remote Sensing Object Detection (RSOD) and UCS-AOD datasets


Summary

Introduction

Remote sensing images [1,2,3,4] have attracted growing research interest in the field of computer vision (CV) with the rapid development of satellite and imaging technology. Instead of using a region proposal network (RPN), one-stage algorithms obtain the predicted location and category information directly. They are called regression-based algorithms, and they can usually achieve higher detection speed than two-stage ones. Weber et al. [41] proposed a method that uses image analysis to extract coastline templates and adopted it to detect oil tanks. Compared with the previous two classes of algorithms, remote sensing target detection algorithms based on deep learning have better accuracy and robustness because they no longer rely on hand-designed features. As an advanced target detection model, YOLO-V3 adopts a feature pyramid network (FPN) [46,47] and ResNet (Residual Network) [48], and achieves good performance in speed and accuracy.
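The abstract states that the number of detection scales was increased to four. As a rough illustration of what an extra scale means, the sketch below computes the grid size and number of candidate boxes per scale. It is not the authors' implementation: the 416×416 input, the three anchors per scale, and the stride of 4 for the added scale are assumptions chosen for illustration (strides 32, 16, and 8 are the standard YOLO-V3 configuration), and the names `DetectionScale` and `detection_scales` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionScale:
    stride: int       # downsampling factor between input image and feature map
    grid: int         # feature map is grid x grid cells
    predictions: int  # candidate boxes predicted at this scale


def detection_scales(input_size=416, strides=(32, 16, 8, 4), anchors_per_scale=3):
    """Return the grid size and candidate-box count for each detection scale.

    Assumptions (not taken from the source): square input, three anchors per
    scale, and a hypothetical fourth scale at stride 4.
    """
    scales = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        grid = input_size // stride
        # Every grid cell predicts one box per anchor.
        scales.append(DetectionScale(stride, grid, grid * grid * anchors_per_scale))
    return scales


# Standard three-scale YOLO-V3 layout versus the assumed four-scale layout.
three_scale = detection_scales(strides=(32, 16, 8))
four_scale = detection_scales()

for scale in four_scale:
    print(f"stride {scale.stride:>2}: {scale.grid}x{scale.grid} grid, "
          f"{scale.predictions} candidate boxes")

print("three scales:", sum(s.predictions for s in three_scale))
print("four scales: ", sum(s.predictions for s in four_scale))
```

Under these assumed settings, the three-scale layout produces 10,647 candidate boxes and the added stride-4 scale raises the total to 43,095. The finer grid gives small targets such as aircraft more cells to fall into, at the cost of additional computation.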

The Theory of YOLO
The Principle of YOLO
The Network of YOLO-V3
The Proposed Algorithm with Multi-Scale Detection
K-Means for Anchor Boxes
2: Calculate the distance between each ground truth and each cluster center:
Relative to the Grid Cell
The NMS Algorithm for Merging Bounding Boxes
Experiment and Results
Loss Function
The Evaluation Indicators
Experiment on Remote Sensing Target Detection
Dataset Analysis
Experimental Results and Analysis on the RSOD and UCS-AOD Datasets
Method
Ablation Experiments
Comparison of YOLO-V3 and Our Approach
Conclusions
