Abstract

Remote sensing images are often of low quality because of equipment limitations, and a target that is blurred or small is extremely difficult to identify. The main challenge is that objects in remote sensing images occupy very few pixels. Traditional convolutional networks struggle to extract enough information from such objects through local convolution and are easily disturbed by noise points, so they usually perform poorly when classifying and recognizing small targets. The current solution is to process feature map information at multiple scales, but this approach does not consider how the context information of the feature map can supplement the semantics. In this work, to enable CNNs to make full use of context information and improve their representation ability, we propose a residual attention function fusion method, which improves the representation ability of feature maps by fusing contextual feature map information of different scales. We then propose a spatial attention mechanism based on the global pixel point convolution response. This method compresses global pixels through convolution and weights the pixels of the original feature map, which reduces noise interference and improves the network's ability to capture globally critical pixel information. Ship recognition experiments on remote sensing image data sets show that the network structure improves small-target detection performance. Results on CIFAR-10 and CIFAR-100 indicate that the attention mechanism is general and practical.
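
As a rough illustration of the spatial weighting idea described above, the following Python sketch compresses the channels at each pixel into a single response, squashes it to (0, 1), and uses it to reweight the original feature map. It is not the authors' attention module, whose exact design is not given here: the function names, the 1x1 channel-mixing kernel `channel_weights`, and the sigmoid gate are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def spatial_attention(feature_map, channel_weights, bias=0.0):
    """Reweight a feature map by a per-pixel attention score.

    feature_map: list of C channels, each an H x W nested list of floats.
    channel_weights: C floats acting as a hypothetical 1x1 convolution kernel.
    Returns (weighted_feature_map, attention_map).
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    # Compress all channels at each pixel into one response, then gate it.
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            response = bias + sum(
                channel_weights[c] * feature_map[c][i][j] for c in range(channels)
            )
            row.append(sigmoid(response))
        attention.append(row)

    # Scale every channel of the original map by the shared spatial score.
    weighted = [
        [[feature_map[c][i][j] * attention[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return weighted, attention


# Usage: two channels, 2 x 2 pixels, one strongly responding pixel.
fmap = [
    [[0.0, 0.0], [0.0, 10.0]],
    [[0.0, 0.0], [0.0, 10.0]],
]
weighted, attention = spatial_attention(fmap, channel_weights=[1.0, 1.0])
```

In this toy input the strongly responding pixel keeps almost all of its value while zero-response pixels receive a neutral score of 0.5, which is the sense in which such a gate can emphasize critical pixels over background.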

Highlights

  • At present, sensor information is a hot research target. The acquisition of sensor information and mobile computing are relatively mature [1]. There have been many outstanding achievements in the research of basic data types of sensor information, and many successful algorithms have been proposed [2, 3]

  • Information processing remains difficult: because of device limitations, the images obtained by sensors tend to be noisy, with small and blurred targets

  • Experimental results show that the proposed model improves the accuracy of small-target recognition in remote sensing images

Summary

Introduction

Sensor information is a hot research target. The acquisition of sensor information and mobile computing are relatively mature [1]. There have been many outstanding achievements in the research of basic data types of sensor information, and many successful algorithms have been proposed [2, 3]. Multiscale deep convolutional neural network (MS-CNN) [18] extracts proposal regions from feature maps of different scales and uses deconvolution in place of upsampling the input image to improve speed and accuracy. Single-shot multibox detector (SSD) extends the truncated VGG16 [19], its backbone network, with several additional convolution layers and sets different default box sizes according to different receptive fields, so it can better predict targets of various scales. Because the use of low-level features is intentionally avoided and the shallow layers of a convolutional neural network cannot fully extract features, the performance of the detector in small-scale target detection is still limited. The detection network based on multiscale fusion features improves detection accuracy by injecting large-scale context information. (iii) On the benchmark data sets CIFAR-10 and CIFAR-100, our GPC attention mechanism is more accurate than current attention mechanisms such as CBAM [22] and SENet [23] and achieves state-of-the-art results
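
To make the multiscale fusion idea above concrete, the following Python sketch upsamples a coarse feature map and adds it to a finer one as a residual term. It is an assumption-laden illustration rather than the fusion module proposed in the paper: nearest-neighbor upsampling, element-wise addition, and the names `upsample_nearest` and `fuse_residual` are all hypothetical choices.

```python
def upsample_nearest(channel, factor):
    """Enlarge an H x W map by an integer factor using nearest-neighbor copies."""
    height = len(channel)
    width = len(channel[0])
    return [
        [channel[i // factor][j // factor] for j in range(width * factor)]
        for i in range(height * factor)
    ]


def fuse_residual(fine, coarse, factor=2):
    """Add upsampled coarse-scale context to a fine-scale map.

    fine: H x W nested list (high resolution, local detail).
    coarse: (H / factor) x (W / factor) nested list (low resolution, context).
    """
    upsampled = upsample_nearest(coarse, factor)
    if len(upsampled) != len(fine) or len(upsampled[0]) != len(fine[0]):
        raise ValueError("coarse map times factor must match the fine map size")
    # Residual-style fusion: keep the fine map and add context on top of it.
    return [
        [f + u for f, u in zip(fine_row, up_row)]
        for fine_row, up_row in zip(fine, upsampled)
    ]


# Usage: a 4 x 4 fine map fused with a 2 x 2 coarse context map.
fine_map = [[1.0] * 4 for _ in range(4)]
coarse_map = [[10.0, 20.0], [30.0, 40.0]]
fused = fuse_residual(fine_map, coarse_map, factor=2)
```

Adding rather than replacing keeps the fine-scale detail intact, so the coarse map only supplies context for the pixels it covers.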

Related Work
Method
Experiments and Results
Residual
Ablation Experiment on cifar100
Design
Conflicts of Interest