Abstract

The rapid growth of remote sensing applications across many visual tasks has created an increasing demand for images with more precise details. However, it is impractical to directly acquire images that are simultaneously rich in spatial, spectral, and temporal information. One feasible solution is to fuse the information from multiple images. Since deep learning has recently achieved impressive results in image processing, this paper provides a comprehensive review of deep learning-based methods for fusing remote sensing images at the pixel level. Specifically, we first introduce several traditional methods and their main limitations, and briefly present four basic deep learning models commonly used in the field. On this basis, the research progress of these models in spatial information fusion and spatio-temporal fusion is reviewed. The current status of these models is further discussed through coarse quantitative comparisons using several image quality metrics. We find that deep learning models have not achieved overwhelming superiority over traditional methods but show great potential; in particular, generative adversarial networks, with their strong capabilities in image generation and unsupervised learning, should become a focus of future research. The joint use of different models should also be considered to fully exploit multi-modal information. In addition, there is a lack of substantive research on pixel-level fusion of radar and optical images, which requires more attention in future work.
