HAFNet: A Heterogeneous Adaptive Fusion Network of Optical and SAR Imagery for Improved Land Use Classification
ABSTRACT: The interpretation of high-resolution remote sensing imagery is essential for accurate land use classification. Optical and synthetic aperture radar (SAR) imagery exhibit complementary characteristics, and their fusion offers an effective way to mitigate the limitations of single-modal data and improve classification performance. However, the modal heterogeneity and complexity of optical and SAR imagery pose significant challenges for effective fusion. To address these issues, we propose a heterogeneous adaptive fusion network (HAFNet). First, the multi-modality feature extractor (M2FE) leverages HRNet to retain the spatial details and local textures of optical images, while a lightweight SparseDense Transformer captures the global structural patterns of SAR data. Within M2FE, a multi-branch integrated feature enhancement module further strengthens the extracted features by emphasizing essential semantic information and suppressing noise. Second, the adaptive multi-scale attention fusion module employs multi-scale channel and spatial attention to capture critical information from both modalities and incorporates a gating mechanism to adjust fusion weights, thereby dynamically exploiting cross-modal complementarity. Finally, a U-shaped framework with skip connections, enhanced by a dynamic channel fusion module, restores spatial resolution and improves the recognition accuracy of small-scale land cover categories. To validate HAFNet, we conducted extensive comparisons with state-of-the-art models on three public datasets with different resolutions. Experimental results demonstrate that HAFNet achieves improvements of approximately 1.2% in overall accuracy, 1.3% in the Kappa coefficient, 1.4% in F1-score, and 1.6% in mean Intersection over Union, confirming its effectiveness in land use classification.
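The gated adaptive fusion described above can be illustrated with a minimal sketch. The module below is a hypothetical, simplified version of a gated cross-modal fusion block, not the authors' released implementation: channel attention highlights informative responses in each modality, and a learned per-pixel gate blends the two streams. All names (`GatedFusion`, etc.) are illustrative.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Hypothetical sketch of gated cross-modal fusion (not the HAFNet code).

    Channel attention re-weights each stream, then a learned per-pixel gate
    decides how much of the optical vs. SAR feature to keep.
    """
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Squeeze-and-excitation style channel attention, one per modality.
        def channel_attn():
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )
        self.attn_opt = channel_attn()
        self.attn_sar = channel_attn()
        # Gate: looks at both streams, outputs a per-pixel weight in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, f_opt: torch.Tensor, f_sar: torch.Tensor) -> torch.Tensor:
        f_opt = f_opt * self.attn_opt(f_opt)  # emphasize informative optical channels
        f_sar = f_sar * self.attn_sar(f_sar)  # emphasize informative SAR channels
        g = self.gate(torch.cat([f_opt, f_sar], dim=1))
        return g * f_opt + (1.0 - g) * f_sar  # adaptive per-pixel blend

# Usage on dummy feature maps (batch 2, 64 channels, 32x32):
fused = GatedFusion(64)(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
print(fused.shape)  # torch.Size([2, 64, 32, 32])
```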
- Research Article
- 10.1109/jstars.2013.2249656
- Oct 1, 2013
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
A new stereoscopic road network extraction framework based on the decision-level fusion of optical and Synthetic Aperture Radar (SAR) imagery is proposed in this paper. The framework comprises three steps: 1) road segment extraction and structure optimization from SAR imagery, 2) road segment extraction and stereoscopic information collection from optical imagery, and 3) fusion of the SAR result with the optical image and the stereoscopic information. In this study, our new road network grouping algorithm, which groups road segments based on multi-scale geometric analysis of the detector response, is used together with an improved footprint method and a stereoscopic inversion algorithm. The most important contribution of our work lies in the fusion step, by which a stereoscopic road network is acquired after the three aforementioned processes by fusing the stereoscopic information obtained from optical imagery with the road network extracted from SAR imagery. Our algorithm is tested on real TerraSAR-X and QuickBird data.
- Research Article
- 10.1016/j.knosys.2024.112387
- Aug 17, 2024
- Knowledge-Based Systems
BCLNet: Boundary contrastive learning with gated attention feature fusion and multi-branch spatial-channel reconstruction for land use classification
- Research Article
- 10.3390/rs14215316
- Oct 24, 2022
- Remote Sensing
Continuous and accurate acquisition of surface water distribution is important for water resources evaluation, especially high-precision flood monitoring. During surface water extraction, optical imagery is strongly affected by clouds, while synthetic aperture radar (SAR) imagery is easily influenced by numerous physical factors; thus, water extraction methods based on single-sensor imagery cannot obtain high-precision water extents across multiple scenarios. Here, we integrated the radar backscattering coefficient of ground objects into the Normalized Difference Water Index to construct a novel SAR and Optical Imagery Water Index (SOWI), and extracted the water extents of five study areas. We compared two previous automatic extraction methods based on single-sensor imagery and evaluated the accuracy of the extraction results. Compared with using optical or SAR imagery alone, the accuracy in all five regions improved by 1–18%. The fusion-derived products achieved user accuracies ranging from 95% to 99% and Kappa coefficients ranging from 85% to 97%. SOWI was then applied to monitor the 2021 heavy-rainfall-induced flood disaster in Henan Province, yielding a time series of flood inundation extents. Our results verify SOWI's capability for continuous high-precision monitoring, accurately identifying waterbodies beneath clouds and algal blooms. By reducing random noise, SOWI mitigates the defects of SAR and overcomes the roughness of water boundaries. SOWI is suitable for high-precision water extraction in myriad scenarios and has great potential for use in flood disaster monitoring and water resources statistics.
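The abstract does not give SOWI's closed form, so the snippet below is only a schematic of the general idea: normalize the SAR backscatter coefficient and blend it with NDWI so that water (low backscatter, high NDWI) scores high. The weighting scheme, normalization range, and threshold are assumptions for illustration, not the published index.

```python
import numpy as np

def ndwi(green: np.ndarray, nir: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalized Difference Water Index from optical bands."""
    return (green - nir) / (green + nir + eps)

def sowi_like(green, nir, sigma0_db, w=0.5, lo=-25.0, hi=0.0):
    """Schematic SAR+optical water index (NOT the published SOWI formula).

    sigma0_db: SAR backscatter in dB; water is typically dark (low sigma0),
    so we invert the normalized backscatter before blending with NDWI.
    """
    sar_term = 1.0 - np.clip((sigma0_db - lo) / (hi - lo), 0.0, 1.0)
    return w * ndwi(green, nir) + (1.0 - w) * sar_term

# Dummy 4x4 scene: the water pixel has high green, low NIR, low backscatter.
green = np.full((4, 4), 0.3); nir = np.full((4, 4), 0.4); s0 = np.full((4, 4), -8.0)
green[0, 0], nir[0, 0], s0[0, 0] = 0.4, 0.1, -22.0   # a water pixel
water_mask = sowi_like(green, nir, s0) > 0.6          # threshold is illustrative
print(water_mask[0, 0], water_mask[1, 1])             # True False
```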
- Conference Article
- 10.1109/igarss47720.2021.9553751
- Jul 11, 2021
Synthetic Aperture Radar (SAR) imagery has been one of the important tools supporting earth observation and topographic measurement. SAR imagery is essentially rich in structural information, yet some important target categories are difficult to recognize in it. Optical imagery contains rich and clear spectral information, which benefits semantic image segmentation. The success of deep neural networks for semantic segmentation heavily depends on large-scale, well-labeled datasets, which are hard to collect in practice. In this paper, we consider deep transfer learning for semantic segmentation and propose a novel transfer learning method that transfers a semantic segmentation model from SAR imagery to fused SAR and optical imagery. The experimental results show that the proposed method achieves higher mean Intersection over Union (mIoU) with less training time compared with other methods.
- Conference Article
- 10.1109/eorsa.2012.6261162
- Jun 1, 2012
Data fusion is an efficient way to exploit multi-source, multi-platform, and multi-angle remotely sensed information. Optical imagery and SAR (synthetic aperture radar) data are complementary in terms of data acquisition capability and image characteristics. Given their different capabilities and unique information content, fusion of high-resolution SAR and optical multi-spectral imagery can improve land use classification accuracy. Texture information plays an important role in class discrimination, especially in SAR imagery, because its backscatter is sensitive to the type, orientation, homogeneity, and spatial relationship of ground objects. In order to take full advantage of multi-source remotely sensed data and combine their different features, this paper puts forward a data fusion method for high-spatial-resolution remotely sensed data based on texture analysis. Texture features of high-resolution SAR imagery were extracted using the GLCM (Grey Level Co-occurrence Matrix) method. The texture features were computed in four directions (0°, 45°, 90°, and 135°), and moving window sizes from 3×3 and 5×5 up to 31×31, 41×41, 51×51, and 61×61 were tested to analyze their influence. The selected texture features were then combined with the SAR data for classification. Both images were classified using an object-based, rule-based approach. Finally, a decision-level fusion was implemented, and the classification accuracy improved from 78.7% and 83.0% to 88.8%.
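As a concrete illustration of the texture step, the sketch below computes GLCM features in the four stated directions for a single moving window using scikit-image; in practice this window would be slid across the whole SAR scene at each tested size. The data are synthetic and the quantization level is an illustrative choice.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
sar = rng.integers(0, 64, size=(51, 51), dtype=np.uint8)  # toy quantized SAR patch

# Four directions used in the paper: 0, 45, 90, and 135 degrees.
angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]

def glcm_features(window: np.ndarray) -> dict:
    """Texture features of one moving window, averaged over the four directions."""
    glcm = graycomatrix(window, distances=[1], angles=angles,
                        levels=64, symmetric=True, normed=True)
    return {prop: graycoprops(glcm, prop).mean()          # mean over angles
            for prop in ("contrast", "homogeneity", "energy", "correlation")}

# One 31x31 window centered in the patch; slide it over the image in practice.
feats = glcm_features(sar[10:41, 10:41])
print(feats)
```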
- Research Article
- 10.3390/rs17071298
- Apr 5, 2025
- Remote Sensing
Land use and land cover (LULC) classification through remote sensing imagery serves as a cornerstone for environmental monitoring, resource management, and evidence-based urban planning. While Synthetic Aperture Radar (SAR) and optical sensors individually capture distinct aspects of Earth's surface, their complementary nature (SAR excels in structural and all-weather observation, while optical sensors provide rich spectral information) offers untapped potential for improving classification robustness. However, the intrinsic differences in their imaging mechanisms (e.g., SAR's coherent scattering versus optical's reflectance properties) pose significant challenges in achieving effective multimodal fusion for LULC analysis. To address this gap, we propose a multimodal deep-learning framework that systematically integrates SAR and optical imagery. Our approach employs a dual-branch neural network and rigorously compares two fusion paradigms: an Early Fusion strategy and a Late Fusion strategy. Experiments on the SEN12MS dataset, a benchmark containing globally diverse land cover categories, demonstrate the framework's efficacy. Our Early Fusion strategy achieved 88% accuracy (F1 score: 87%), outperforming the Late Fusion approach (84% accuracy, F1 score: 82%). The results indicate that optical data provide detailed spectral signatures useful for identifying vegetation, water bodies, and urban areas, whereas SAR data contribute valuable texture and structural details. Early Fusion's superiority stems from synergistic low-level feature extraction, capturing cross-modal correlations that are lost in late-stage fusion. Compared to state-of-the-art baselines, the proposed methods show a significant improvement in classification accuracy, demonstrating that multimodal fusion mitigates single-sensor limitations (e.g., optical cloud obstruction and SAR speckle noise). This study advances remote sensing technology by providing a precise and effective method for LULC classification.
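The two paradigms compared above can be sketched in a few lines. Below is a hypothetical minimal version: early fusion concatenates SAR and optical bands before a shared encoder, while late fusion runs one encoder per modality and merges class logits at the end. The tiny CNNs and band counts are stand-ins, not the authors' network.

```python
import torch
import torch.nn as nn

def encoder(in_ch: int, n_classes: int) -> nn.Sequential:
    """Tiny stand-in CNN classifier (illustrative, not the paper's backbone)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_classes),
    )

class EarlyFusion(nn.Module):
    def __init__(self, sar_ch=2, opt_ch=13, n_classes=10):  # e.g. S1 + S2 band counts
        super().__init__()
        self.net = encoder(sar_ch + opt_ch, n_classes)
    def forward(self, sar, opt):
        return self.net(torch.cat([sar, opt], dim=1))        # stack bands up front

class LateFusion(nn.Module):
    def __init__(self, sar_ch=2, opt_ch=13, n_classes=10):
        super().__init__()
        self.sar_net = encoder(sar_ch, n_classes)
        self.opt_net = encoder(opt_ch, n_classes)
    def forward(self, sar, opt):
        return 0.5 * (self.sar_net(sar) + self.opt_net(opt)) # merge decisions at the end

sar, opt = torch.randn(4, 2, 64, 64), torch.randn(4, 13, 64, 64)
print(EarlyFusion()(sar, opt).shape, LateFusion()(sar, opt).shape)
```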
- Conference Article
- 10.1117/12.2321305
- Sep 18, 2018
Normalized Difference Vegetation Index (NDVI) values extracted from remotely sensed optical imagery are used ubiquitously to monitor crop condition. However, challenges in the operational use of optical imagery are well documented, making it difficult to capture measures of crop condition during critical phenology stages when clouds obscure the surface. This study investigates the integration of Synthetic Aperture Radar (SAR) and optical imagery to characterize crop canopy condition in order to deliver daily measures of NDVI during the entire growing season. Multitemporal C-band polarimetric RADARSAT-2 SAR data and RapidEye images were acquired in 2012 for a study site in western Canada. SAR polarimetric parameters and NDVI were extracted, and their temporal variations were interpreted with respect to the development of the canola canopy. Optical NDVI was statistically related to SAR polarimetric parameters over test canola fields. Significant correlations were documented between a number of SAR polarimetric parameters and optical NDVI, in particular HV backscatter, span, the volume scattering component of the Freeman-Durden decomposition, and the radar vegetation index, with R-values of 0.83, 0.72, 0.81, and 0.71, respectively. Based on this statistical analysis, SAR polarimetric parameters were calibrated to optical NDVI, creating a SAR-calibrated NDVI (SARc-NDVI). A canopy structure dynamics model (CSDM) was fitted to the SARc-NDVI, providing a seasonal temporal vegetation index curve. The coupling of NDVI from optical and SAR imagery with a CSDM demonstrates the potential to derive daily measures of crop condition over the entire growing season.
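The calibration and curve-fitting chain can be sketched as two regressions. The paper's exact CSDM form is not given here, so the double-logistic below is a common assumption for seasonal vegetation-index curves, used purely for illustration; the synthetic data stand in for per-field RADARSAT-2 parameters and RapidEye NDVI.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

rng = np.random.default_rng(1)
doy = np.linspace(140, 260, 25)                      # sampling dates (day of year)

def csdm(t, base, amp, t_up, r_up, t_down, r_down):
    """Assumed double-logistic canopy structure dynamics model (CSDM)."""
    return base + amp * (1 / (1 + np.exp(-r_up * (t - t_up)))
                         - 1 / (1 + np.exp(-r_down * (t - t_down))))

true_ndvi = csdm(doy, 0.2, 0.6, 170, 0.15, 235, 0.12)
rvi = 0.9 * true_ndvi + 0.05 + rng.normal(0, 0.02, doy.size)  # toy radar vegetation index

# Step 1: calibrate the SAR parameter to optical NDVI with a linear fit.
fit = linregress(rvi, true_ndvi + rng.normal(0, 0.02, doy.size))
sarc_ndvi = fit.intercept + fit.slope * rvi          # SAR-calibrated NDVI

# Step 2: fit the CSDM to SARc-NDVI to interpolate a daily NDVI curve.
p0 = [0.2, 0.6, 170, 0.1, 235, 0.1]
params, _ = curve_fit(csdm, doy, sarc_ndvi, p0=p0, maxfev=10000)
daily_ndvi = csdm(np.arange(140, 261), *params)      # daily crop-condition estimates
print(f"r = {fit.rvalue:.2f}, daily curve length = {daily_ndvi.size}")
```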
- Research Article
- 10.3390/s25020329
- Jan 8, 2025
- Sensors (Basel, Switzerland)
The fusion of synthetic aperture radar (SAR) and optical satellite imagery poses significant challenges for ship detection due to the distinct characteristics and noise profiles of each modality. Optical imagery provides high-resolution information but struggles in adverse weather and low-light conditions, reducing its reliability for maritime applications. In contrast, SAR imagery excels in these scenarios but is prone to noise and clutter, complicating vessel detection. Existing research on SAR and optical image fusion often fails to effectively leverage their complementary strengths, resulting in suboptimal detection outcomes. This research presents a novel fusion framework designed to enhance ship detection by integrating SAR and optical imagery. This framework incorporates a detection system for optical images that utilizes Contrast Limited Adaptive Histogram Equalization (CLAHE) in combination with the YOLOv7 model to improve accuracy and processing speed. For SAR images, a customized Detection Transformer model, SAR-EDT, integrates advanced denoising algorithms and optimized pooling configurations. A fusion module evaluates the overlaps of detected bounding boxes based on intersection over union (IoU) metrics. Fused detections are generated by averaging confidence scores and recalculating bounding box dimensions, followed by robust postprocessing to eliminate duplicates. The proposed framework significantly improves ship detection accuracy across various scenarios.
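The fusion rule described here (match boxes by IoU, average confidences, recompute box extents) can be sketched as below. The IoU threshold and the [x1, y1, x2, y2] box format are assumptions, and the detector outputs are dummies rather than YOLOv7 or SAR-EDT results.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def fuse_detections(opt_dets, sar_dets, iou_thr=0.5):
    """Merge optical and SAR detections: overlapping pairs are fused by
    averaging confidence and box coordinates; unmatched boxes pass through."""
    fused, used = [], set()
    for ob, oc in opt_dets:
        best_j, best_iou = -1, iou_thr
        for j, (sb, sc) in enumerate(sar_dets):
            if j not in used and iou(ob, sb) >= best_iou:
                best_j, best_iou = j, iou(ob, sb)
        if best_j >= 0:
            sb, sc = sar_dets[best_j]; used.add(best_j)
            fused.append(([(o + s) / 2 for o, s in zip(ob, sb)], (oc + sc) / 2))
        else:
            fused.append((ob, oc))
    fused += [d for j, d in enumerate(sar_dets) if j not in used]
    return fused

opt = [([10, 10, 50, 40], 0.90)]
sar = [([12, 8, 52, 42], 0.70), ([200, 200, 230, 220], 0.60)]
print(fuse_detections(opt, sar))  # one fused ship + one SAR-only ship
```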
- Research Article
- 10.5194/isprs-archives-xliii-b1-2020-91-2020
- Aug 6, 2020
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Space-borne remote sensing data are widely used for land-cover classification (LCC) due to their ability to provide large amounts of data with a regular temporal revisit time. In recent years, optical and synthetic aperture radar (SAR) imagery have become freely available, and their integration in time series has improved LCC. This research evaluates classification accuracy using multitemporal (MT) Sentinel-1 (S1) and Sentinel-2 (S2) imagery. Pixel-based LCC is performed for S1 and S2 imagery, and for a combination of both datasets, with Random Forest (RF) and Extreme Gradient Boosting (XGBoost; XGB). The study area is located in the south-east of France, in Lyon. Regardless of whether single-date or MT data were used, the highest classification results were achieved with integrated S1 and S2 imagery and the XGB method, with overall accuracy (OA) and the Kappa coefficient (Kappa) increasing from 85.51% to 91.09% and from 0.81 to 0.88, respectively. Furthermore, the integration of MT imagery significantly improved the classification of urban areas and reduced misclassification between forest and low vegetation. In terms of pixel-based classification, XGB produced slightly better results than RF and outperformed it in computational time. This research improved LCC through the integration of radar and optical MT imagery, which can be useful for areas hampered by frequent cloud cover. Future work should apply these data to specific remote sensing applications and evaluate classification performance with different approaches, such as neural networks or deep learning.
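A pixel-based version of this comparison is easy to sketch: stack S1 backscatter and S2 reflectance into one feature vector per pixel and train both classifiers. The snippet uses synthetic features in place of real Sentinel time series; band counts and hyperparameters are illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(42)
n_pixels, n_s1, n_s2, n_classes = 5000, 4, 10, 5    # toy multitemporal band counts
X = rng.normal(size=(n_pixels, n_s1 + n_s2))        # stacked S1 + S2 pixel features
y = (X[:, :3].sum(axis=1) // 1.5).astype(int) % n_classes  # synthetic labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("RF", RandomForestClassifier(n_estimators=200, random_state=0)),
                  ("XGB", XGBClassifier(n_estimators=200, max_depth=6,
                                        learning_rate=0.1, verbosity=0))]:
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(f"{name}: OA={accuracy_score(y_te, pred):.3f}, "
          f"Kappa={cohen_kappa_score(y_te, pred):.3f}")
```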
- Research Article
- 10.3390/rs16152802
- Jul 31, 2024
- Remote Sensing
Land cover classification of Synthetic Aperture Radar (SAR) imagery is a significant research direction in SAR image interpretation. However, due to the unique imaging methodology of SAR, interpreting SAR images presents numerous challenges, and land cover classification using SAR imagery often lacks innovative features. Distributed scatterer interferometric synthetic aperture radar (DS-InSAR), a common technique for deformation extraction, generates several intermediate parameters during its processing that have a close relationship with land features. Therefore, this paper uses the coherence matrix, the number of statistically homogeneous pixels (SHPs), and ensemble coherence, all intermediate products of DS-InSAR processing, as classification features, combined with the backscatter intensity of multi-temporal SAR imagery, to explore the impact of these features on the discernibility of land objects in SAR images. The results indicate that the adopted features improve land cover classification accuracy; SHPs and ensemble coherence are particularly important in distinguishing land features, demonstrating that the proposed features can serve as new attributes for land cover classification in SAR imagery.
- Research Article
- 10.1016/j.isprsjprs.2008.07.006
- Oct 2, 2008
- ISPRS Journal of Photogrammetry and Remote Sensing
Integration of optical and Synthetic Aperture Radar (SAR) imagery for delivering operational annual crop inventories
- Research Article
- 10.3390/rs16132459
- Jul 4, 2024
- Remote Sensing
Optical and Synthetic Aperture Radar (SAR) imagery offer a wealth of complementary information on a given target, attributable to the distinct imaging modalities of the two image types. Thus, multimodal remote sensing data have been widely used to improve land cover classification. However, fully integrating optical and SAR image data is not straightforward due to the distinct distributions of their features. To this end, we propose a land cover classification network based on multimodal feature fusion, MFFnet. We adopt a dual-stream network to extract features, where a ResNet is utilized to extract deep features from optical images and PidiNet is employed to extract edge features from SAR images. Simultaneously, the iAFF feature fusion module facilitates interactions between the modalities for both low- and high-level features. Additionally, to enhance global feature dependency, the ASPP module handles the interactions between high-level features. The processed high-level features from the dual-stream encoder are fused with low-level features and fed into the decoder to restore full-resolution feature maps and generate predicted images. Comprehensive evaluations demonstrate that MFFnet achieves excellent performance in both qualitative and quantitative assessments on the WHU-OPT-SAR dataset, improving the OA and Kappa metrics over the second-best results by 7.7% and 11.26%, respectively.
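Of the named components, ASPP is the most self-contained and can be sketched compactly: parallel atrous convolutions at several dilation rates plus a global-pooling branch capture context at multiple scales before the features are merged. The rates below follow the common DeepLab convention and are an assumption, not necessarily MFFnet's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling (simplified DeepLab-style sketch)."""
    def __init__(self, in_ch: int, out_ch: int, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1)] +                      # 1x1 branch
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r)  # atrous branches
             for r in rates])
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),       # image-level context
                                  nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = [b(x) for b in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode="bilinear", align_corners=False))
        return self.project(torch.cat(feats, dim=1))

out = ASPP(256, 64)(torch.randn(1, 256, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```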
- Research Article
- 10.1007/bf02997072
- Dec 1, 1990
- Journal of the Indian Society of Remote Sensing
The paper describes a comparative study of geological interpretations carried out from Synthetic Aperture Radar (SAR) imagery, Landsat MSS (B & W) imagery, and aerial photographs, covering 2100 sq km in the Anantapur district of Andhra Pradesh. The area comprises the Peninsular Gneissic Complex and rocks of the Dharwar and Cuddapah Super Groups, besides the Quaternary alluvial deposits along the Penneru river and its tributaries. Geomorphologically, the area is represented by denudational, fluvial, and structural landforms. The study indicates that the geological and geomorphological maps prepared from SAR imagery and aerial photographs are comparable in detail despite the smaller scale of the SAR imagery, while the same detail is not exhibited in Landsat imagery, mainly due to its low resolution. Although broad lithological units can be discriminated on SAR imagery as well as aerial photographs, some finer rock types, viz. gabbroic dykes, could be discriminated from dolerite dykes in the SAR imagery due to their different surface roughness. The stereoscopic coverage and enhanced micro-relief of SAR imagery give better geomorphological detail than aerial photographs. A detailed study of lineaments shows that SAR imagery over-represents short lineaments across the look direction, due to enhanced micro-relief and radar-shadow effects, and under-represents lineaments along the look direction. Landsat imagery is perhaps the best for demarcating lineaments of regional magnitude, while aerial photographs are good for depicting shorter lineaments. However, certain lineaments seen in SAR imagery are often not continuously visible on aerial photographs.
- Research Article
- 10.3390/rs13030491
- Jan 30, 2021
- Remote Sensing
Numerous earth observation data obtained from different platforms have been widely used in various fields, and geometric calibration is a fundamental step for these applications. Traditional calibration methods are developed based on the rational function model (RFM), which is produced by image vendors as a substitute for the rigorous sensor model (RSM). Generally, the fitting accuracy of the RFM is much better than 1 pixel, whereas it degrades to several pixels in mountainous areas, especially for Synthetic Aperture Radar (SAR) imagery. Therefore, this paper proposes a new combined adjustment for improving the geolocation accuracy of multi-source satellite SAR and optical imagery. Tie points are extracted with a robust image matching algorithm, and relationships between the parameters of the range-Doppler (RD) model and the RFM are developed by transforming them into the same geodetic coordinate system. At the same time, a heterogeneous weighting strategy is designed for better convergence. Experimental results indicate that the proposed model achieves much higher geolocation accuracy, approximately 2.60 pixels in the X direction and 3.50 pixels in the Y direction. Compared with traditional methods based on the RFM, the proposed model provides a new way to synergistically use multi-source remote sensing data.
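The core of such a combined adjustment is a weighted least-squares solve in which tie points from different sensors receive different weights. The toy example below estimates a 2D shift from simulated SAR and optical tie-point residuals with inverse-variance weighting; it only illustrates the heterogeneous-weight idea, not the paper's full RD/RFM parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
true_shift = np.array([2.6, 3.5])                         # pixels, (x, y)

# Tie-point residual observations from two sources with different noise levels:
sar_obs = true_shift + rng.normal(0, 1.5, size=(20, 2))   # noisier SAR tie points
opt_obs = true_shift + rng.normal(0, 0.5, size=(30, 2))   # more precise optical ones

# Stack into l = A x + e, where x is the unknown shift and A repeats I_2.
l = np.concatenate([sar_obs, opt_obs]).ravel()
A = np.tile(np.eye(2), (50, 1))

# Heterogeneous weights: inverse variance per observation group.
w = np.concatenate([np.full(40, 1 / 1.5**2), np.full(60, 1 / 0.5**2)])
W = np.diag(w)

x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ l)  # weighted least-squares estimate
print(x_hat)  # close to [2.6, 3.5]
```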
- Research Article
- 10.1016/j.rse.2024.114373
- Aug 31, 2024
- Remote Sensing of Environment
Flood inundation monitoring using multi-source satellite imagery: a knowledge transfer strategy for heterogeneous image change detection