Abstract
Remote sensing images contain diverse land surface scenes and ground objects at different scales, which greatly increases the difficulty of super-resolution tasks. Existing deep learning-based methods do not solve this problem well. To achieve high-quality super-resolution of remote sensing images, a residual aggregation and split attentional fusion network (RASAF) is proposed in this article. It consists of three main parts. First, a split attentional fusion block is proposed. It uses a basic split–fusion mechanism to achieve cross-channel feature group interaction, allowing the method to adapt to the reconstruction of various land surface scenes. Second, to fully exploit multiscale image information, a hierarchical loss function is used. Third, residual learning is used to reduce the difficulty of training in super-resolution tasks. However, the individual residual branch features are used only locally and fail to represent their full value, so a residual aggregation mechanism is used to aggregate the local residual branch features into higher quality ones. Comparisons of RASAF with several classical super-resolution methods on two widely used remote sensing datasets show that RASAF achieves better performance while maintaining a good balance between performance and the number of model parameters. RASAF's ability to support multilabel remote sensing image classification tasks further demonstrates its usefulness.
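To make the split–fusion idea concrete, below is a minimal numpy sketch of cross-channel group interaction: channels are split into groups, each group receives a softmax attention weight derived from global pooling, and the weighted groups are fused by summation. The function name, group count, and pooling choice are illustrative assumptions, not the paper's exact block design.

```python
import numpy as np

def split_attention_fuse(x, num_groups=4):
    """Illustrative split-fusion sketch (not the paper's exact block):
    split channels into groups, weight each group by a softmax score
    from global average pooling, then fuse by summing across groups."""
    c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    groups = x.reshape(num_groups, c // num_groups, h, w)
    # Global average pooling: one scalar descriptor per group.
    desc = groups.mean(axis=(1, 2, 3))                 # shape: (num_groups,)
    # Softmax over groups yields cross-group attention weights.
    e = np.exp(desc - desc.max())
    weights = e / e.sum()                              # shape: (num_groups,)
    # Weight each group and fuse by summation across the group axis.
    fused = (groups * weights[:, None, None, None]).sum(axis=0)
    return fused                                       # shape: (c // num_groups, h, w)
```

In a trained network the attention weights would come from learned layers rather than raw pooled means; the sketch only shows the split–weight–fuse data flow.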
Highlights
1) We proposed a residual aggregation and split attentional fusion network for super-resolution of remote sensing images with varying scenes and spatial complexity.
2) We introduced a residual aggregation mechanism into remote sensing image super-resolution to aggregate the residual branches into higher quality features, and we defined a hierarchical loss to reduce the complexity of reconstruction.
Summary
Remote sensing images are commonly used in environmental monitoring, military, agriculture, and other fields due to their wide coverage, flexible acquisition times, and rich information. The reconstruction-based method [3] uses efficient prior knowledge of LR and HR image pairs to reduce the size of the solution space. These methods generate sharp texture details, but they suffer from high computational complexity and performance degradation as the magnification factor increases. Some recent deep learning-based algorithms for remote sensing image super-resolution [12]–[15] have shown promising results. The residual features progressively aggregate various aspects of the input image as the network depth increases, which is useful for reconstructing the spatial information of the image. The main contributions of this work are as follows. 1) We proposed a residual aggregation and split attentional fusion network (RASAF) for super-resolution of remote sensing images with varying scenes and spatial complexity. 2) We introduced a residual aggregation mechanism into remote sensing image super-resolution to aggregate the residual branches into higher quality features, and we defined a hierarchical loss to reduce the complexity of reconstruction. 3) We conducted experiments comparing RASAF with several state-of-the-art models on the UCMerced LandUse and PatternNet datasets to validate the improvement achieved by RASAF.
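The two mechanisms named in the contributions can be sketched in a few lines of numpy. The aggregation below uses fixed uniform weights and the hierarchical loss uses per-scale L1 terms; both are simplifying assumptions for illustration (the paper's aggregation would typically use learned layers), and the function names are hypothetical.

```python
import numpy as np

def aggregate_residuals(branch_feats, weights=None):
    """Sketch of residual aggregation: combine the local residual-branch
    features into one aggregated residual via a weighted sum.
    (A trained network would learn this combination, e.g., with convs.)"""
    n = len(branch_feats)
    if weights is None:
        weights = np.full(n, 1.0 / n)  # uniform weights as a placeholder
    return sum(w * f for w, f in zip(weights, branch_feats))

def hierarchical_loss(preds, targets):
    """Sketch of a hierarchical loss: sum per-scale L1 losses over a
    pyramid of predictions, supervising the network at multiple scales."""
    return sum(np.abs(p - t).mean() for p, t in zip(preds, targets))
```

The key point shown is the data flow: several locally used residual branches are merged into a single higher quality residual, and the loss accumulates supervision from every scale of the output pyramid.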