Multimodal fusion network with learnable wavelet-enhanced features for hyperspectral unmixing

Abstract

Hyperspectral unmixing (HU) aims to obtain subpixel-level material composition information, which is crucial for the precise advancement of hyperspectral image processing techniques. In recent years, deep learning (DL) has been widely applied to HU due to its strong ability to capture complex and nonlinear feature relationships in the data. However, relying solely on hyperspectral images for unmixing often fails to effectively distinguish objects in complex scenes, particularly when different endmembers exhibit similar spectral characteristics. To address this limitation, we propose a Multimodal Fusion Network (MMFNet) that incorporates the elevation information inherent in light detection and ranging (LiDAR) data. MMFNet is capable of simultaneously extracting spectral features from hyperspectral images and spatial features from LiDAR data. Furthermore, most existing DL-based HU methods operate only in the original spectral domain, which makes them susceptible to spectral variability, noise, and limited discriminative capacity. To overcome these challenges, we integrate Learnable Wavelet Transform (LWT) into MMFNet to adaptively decompose hyperspectral signals into multiple frequency subdomains, thereby mitigating spectral variability and noise while preserving spatial consistency. In addition, a Multi-Scale Convolution Fusion Module (MSCFM) is designed to capture semantic information at different receptive fields and enhance the fine-grained fusion of spectral and spatial features. Through these designs, MMFNet produces more robust and discriminative feature representations, enabling better separation of spectrally similar endmembers. Extensive experiments demonstrate that the proposed MMFNet outperforms several state-of-the-art unmixing methods.
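The learnable wavelet transform above decomposes each spectrum into frequency subbands before unmixing. As a rough illustration of the underlying idea only, the sketch below performs a single-level wavelet decomposition with a fixed Haar filter pair (MMFNet instead learns the filter taps; the function name and toy spectrum are invented for illustration):

```python
import numpy as np

def haar_dwt_level(spectrum):
    """Single-level Haar wavelet decomposition of a 1-D spectrum.

    Returns (approximation, detail): the low-frequency subband, which is
    more robust to noise, and the high-frequency subband, which carries
    fine spectral detail. MMFNet's LWT learns the filter taps instead of
    fixing them to the Haar pair used here.
    """
    s = np.asarray(spectrum, dtype=float)
    if s.size % 2:                      # pad to even length
        s = np.append(s, s[-1])
    even, odd = s[0::2], s[1::2]
    approx = (even + odd) / np.sqrt(2)  # low-pass subband
    detail = (even - odd) / np.sqrt(2)  # high-pass subband
    return approx, detail

bands = np.sin(np.linspace(0.0, 4.0, 8))  # toy 8-band "spectrum"
a, d = haar_dwt_level(bands)
```

Because the Haar pair is orthonormal, the total signal energy is preserved across the two subbands, so no spectral information is lost by the decomposition itself.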

Similar Papers
  • Research Article
  • Cited by 118
  • 10.1016/j.ecolind.2016.10.001
Fusion of airborne LiDAR data and hyperspectral imagery for aboveground and belowground forest biomass estimation
  • Oct 17, 2016
  • Ecological Indicators
  • Shezhou Luo + 7 more

  • Research Article
  • Cited by 57
  • 10.1016/j.jag.2021.102363
A machine learning algorithm to detect pine wilt disease using UAV-based hyperspectral imagery and LiDAR data at the tree level
  • May 23, 2021
  • International Journal of Applied Earth Observation and Geoinformation
  • Run Yu + 5 more

  • Research Article
  • Cited by 65
  • 10.1109/tgrs.2022.3155794
Multimodal Hyperspectral Unmixing: Insights From Attention Networks
  • Jan 1, 2022
  • IEEE Transactions on Geoscience and Remote Sensing
  • Zhu Han + 5 more

Deep learning (DL) has attracted wide attention in hyperspectral unmixing (HU) owing to its powerful feature representation ability. As a representative unsupervised DL approach, the autoencoder (AE) has been proven effective at capturing the nonlinear components of hyperspectral images better than traditional model-driven linearized methods. However, using only hyperspectral images for unmixing fails to distinguish objects in complex scenes, especially different endmembers with similar materials. To overcome this limitation, we propose a novel multimodal unmixing network for hyperspectral images, called MUNet, which exploits the height differences in light detection and ranging (LiDAR) data in a squeeze-and-excitation (SE)-driven attention fashion to guide the unmixing process, yielding performance improvements. MUNet is capable of fusing multimodal information and using the attention map derived from LiDAR to help the network focus on more discriminative and meaningful spatial information in the scene. Moreover, attribute profiles (APs) are adopted to extract the geometrical structures of different objects to better model the spatial information of LiDAR. Experimental results on synthetic and real datasets demonstrate the effectiveness and superiority of the proposed method compared with several state-of-the-art unmixing algorithms. The codes will be available at https://github.com/hanzhu97702/IEEE_TGRS_MUNet, contributing to the remote sensing community.
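MUNet's LiDAR-guided attention builds on the squeeze-and-excitation (SE) mechanism, which recalibrates feature channels using weights computed from globally pooled statistics. A minimal NumPy sketch of generic SE recalibration follows (not MUNet's exact LiDAR-driven variant; the weight matrices here are random placeholders standing in for trained fully connected layers):

```python
import numpy as np

def squeeze_excite(feat, W1, W2):
    """Generic squeeze-and-excitation channel recalibration.

    feat: (C, H, W) feature map. W1: (C//r, C) and W2: (C, C//r) are the
    two fully connected layers (random placeholders here; in MUNet the
    attention is additionally guided by LiDAR-derived maps).
    """
    z = feat.mean(axis=(1, 2))               # squeeze: global average pool -> (C,)
    h = np.maximum(W1 @ z, 0.0)              # excitation: FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(W2 @ h)))      # FC + sigmoid -> per-channel weights in (0, 1)
    return feat * s[:, None, None]           # rescale each channel

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))            # toy feature map, 8 channels
W1 = rng.normal(size=(2, 8))                 # reduction ratio r = 4
W2 = rng.normal(size=(8, 2))
out = squeeze_excite(feat, W1, W2)
```

Because the sigmoid keeps every channel weight in (0, 1), the block can only attenuate channels, never amplify them, which is what lets an auxiliary modality such as LiDAR suppress spatially uninformative channels.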

  • Research Article
  • Cited by 47
  • 10.1109/tgrs.2021.3108352
Deep Multimodal Fusion Network for Semantic Segmentation Using Remote Sensing Image and LiDAR Data
  • Jan 1, 2022
  • IEEE Transactions on Geoscience and Remote Sensing
  • Yangjie Sun + 4 more

Extracting semantic information from very-high-resolution (VHR) aerial images is a prominent topic in Earth observation research. An increasing number of different sensor platforms are appearing in remote sensing, each of which can provide corresponding multimodal supplemental or enhanced information, such as optical images, light detection and ranging (LiDAR) point clouds, infrared images, or inertial measurement unit (IMU) data. However, current deep networks for LiDAR and VHR images have not fully exploited the potential of multimodal data: simple stacked fusion ignores the structural differences between the modalities and the handcrafted statistical characteristics within each modality. For multimodal remote sensing data and its corresponding carefully designed handcrafted features, we designed a novel deep multimodal fusion network (MFNet) that can use multimodal VHR aerial images and LiDAR data together with the corresponding intramodal features, such as LiDAR-derived features [slope and normalized digital surface model (NDSM)] and imagery-derived features [infrared-red-green (IRRG), normalized difference vegetation index (NDVI), and difference of Gaussians (DoG)]. Technically, we introduce an attention mechanism and multimodal learning to adaptively fuse intermodal and intramodal features. Specifically, we designed a multimodal fusion mechanism, pyramid dilation blocks, and a multilevel feature fusion module. Through these modules, our network realizes the adaptive fusion of multimodal features, enlarges the receptive field, and enhances the global-to-local contextual fusion effect. Moreover, we used a multiscale supervision training scheme to optimize the network. Extensive experimental results and ablation studies on the ISPRS semantic dataset and the IEEE GRSS DFC Zeebrugge dataset show the effectiveness of our proposed MFNet.

  • Conference Article
  • 10.3997/2214-4609.20149959
Characterization of an analogue of fractured reservoir using LIDAR, GPR and conventional data
  • Jan 1, 2010
  • M Coll + 5 more

The Solvay quarry displays karstified and heavily fractured strata of peritidal platform carbonates of late Barremian age that can serve as an analog to subsurface fractured reservoirs. In addition to providing a potential analog, this study also aims to improve the methodology used in building a Digital Outcrop Model (DOM). The originality of the applied methodology lies in the integration of conventional outcrop analysis, LIDAR (Light Detection and Ranging) and GPR (Ground Penetrating Radar) data. The goal is to produce an accurate and efficient DOM that resolves the three-dimensional sub-seismic heterogeneity of the fracture distribution in the strata. Stratigraphic and fracture analysis with conventional methods was performed on about 2 km of exposed cliff faces, which were subsequently scanned with the LIDAR equipment. Transversal and longitudinal 2D GPR lines and 6 GPR cubes were acquired on the quarry floor to correlate with the quarry walls. The 2D GPR data were statically corrected using the GPS horizontal coordinates of the transects, high-resolution topography provided by the LIDAR data, and a replacement velocity of 0.098 m/ns. GPR and LIDAR data were loaded into 3D CAD software to interpret each horizon and to reconstruct the structural framework. To characterize the fracture distribution, scanline measurements were performed along the quarry walls, the 3D migrated GPR data were interpreted by delineating high-amplitude zones originating from focused diffractions that define fracture surfaces (Grasmueck et al. 2005), and the LIDAR point clouds were processed to reveal the main plane families that form the rough wall surface. Two of the GPR cubes show the coexistence of four sub-vertical fracture families trending N-S, E-W, NW-SE and NE-SW. The NE-SW fracture family is not detected in the outcrop using the scanline method because these fractures are parallel to the direction of the quarry wall; however, the LIDAR algorithm found two plane families oriented near this fracture family.
These planes are related to morphological features of the NE-SW joints such as twist hackles. The 3D fractures constructed from the GPR data make it possible to filter and understand the planes computed from the LIDAR data and to determine the sampling bias due to scanline orientation. Subsequently, the LIDAR data and the scanline measurements allow a continuous distribution of the fracture families to be obtained along the quarry, making it possible to characterize dip and azimuth variations.

  • Research Article
  • Cited by 31
  • 10.3390/rs13244969
Estimating Aboveground Carbon Stock at the Scale of Individual Trees in Subtropical Forests Using UAV LiDAR and Hyperspectral Data
  • Dec 7, 2021
  • Remote Sensing
  • Haiming Qin + 3 more

Accurate estimation of aboveground carbon stock for individual trees is important for evaluating forest carbon sequestration potential and maintaining the ecosystem carbon balance. Airborne light detection and ranging (LiDAR) data have been widely used to estimate tree-level carbon stock. However, few studies have explored the potential of combining LiDAR and hyperspectral data to estimate tree-level carbon stock. The objective of this study is to explore the potential of integrating unmanned aerial vehicle (UAV) LiDAR with hyperspectral data for tree-level aboveground carbon stock estimation. To achieve this goal, we first delineated individual trees with a CHM-based watershed segmentation algorithm. We then extracted structural and spectral features from the UAV LiDAR and hyperspectral data, respectively. Pearson correlation analysis was conducted to assess the correlation between LiDAR features, hyperspectral features, and tree-level carbon stock, based on which features were selected for model development. Finally, we developed tree-level carbon stock estimation models based on the Schumacher–Hall formula and stepwise multiple regression. Results showed that both LiDAR and hyperspectral features were strongly correlated with tree-level carbon stock. Tree height (H, r = 0.75) and the green index (GI, r = 0.83) had the highest correlation coefficients with tree-level carbon stock among the LiDAR and hyperspectral features, respectively. The best model using LiDAR features alone includes the metrics H, the 10th height percentile of points (PH10), and the mean height of points (Hmean), and can explain 74% of the variation in tree-level carbon stock. Similarly, the best model using hyperspectral data includes GI and the modified normalized difference vegetation index (mNDVI), and has similar explanatory power (r2 = 0.75).
The model that integrates predictors, namely, GI and the 95th height percentile of points (PH95) from hyperspectral and LiDAR data, substantially improves the explanatory power (r2 = 0.89). These results indicated that while either LiDAR data or hyperspectral data alone can estimate tree-level carbon stock with reasonable accuracy, combining LiDAR and hyperspectral features can substantially improve the explanatory power of the model. Such results suggested that tree-level carbon stock estimation can greatly benefit from the complementary nature of LiDAR-detected structural characteristics and hyperspectral-captured spectral information of vegetation.
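The Schumacher–Hall formula used above is a power-law allometry, C = a·H^b·X^c, which becomes linear after a log transform and can then be fitted by (stepwise) multiple regression. A hedged sketch on synthetic data follows (the coefficients, predictor ranges and noise level are invented for illustration, not the paper's fitted values):

```python
import numpy as np

# Schumacher-Hall form: C = a * H^b * GI^c, i.e. ln C = ln a + b ln H + c ln GI.
rng = np.random.default_rng(1)
H = rng.uniform(5.0, 30.0, 50)           # tree height (m), synthetic
GI = rng.uniform(0.2, 0.9, 50)           # green index, synthetic
lnC = 0.5 + 1.8 * np.log(H) + 0.7 * np.log(GI) + rng.normal(0.0, 0.05, 50)

# Ordinary least squares on the log-transformed (linearized) model.
X = np.column_stack([np.ones_like(H), np.log(H), np.log(GI)])
coef, *_ = np.linalg.lstsq(X, lnC, rcond=None)

carbon = np.exp(X @ coef)                # back-transform to carbon stock
```

Stepwise regression would additionally add or drop predictors (e.g. PH10, Hmean, mNDVI) by comparing fit criteria at each step; the least-squares fit above is the core of every such step.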

  • Research Article
  • Cited by 14
  • 10.3390/s22155735
Dual-Coupled CNN-GCN-Based Classification for Hyperspectral and LiDAR Data.
  • Jul 31, 2022
  • Sensors
  • Lei Wang + 1 more

Deep learning techniques have brought substantial performance gains to remote sensing image classification. Among them, convolutional neural networks (CNN) can extract rich spatial and spectral features from hyperspectral images in a short-range region, whereas graph convolutional networks (GCN) can model middle- and long-range spatial relations (or structural features) between samples on their graph structure. These different features make it possible to classify remote sensing images finely. In addition, hyperspectral images and light detection and ranging (LiDAR) images can provide spatial-spectral information and elevation information of targets on the Earth's surface, respectively. These multi-source remote sensing data can further improve classification accuracy in complex scenes. This paper proposes a classification method for HS and LiDAR data based on a dual-coupled CNN-GCN structure. The model can be divided into a coupled CNN and a coupled GCN. The former employs a weight-sharing mechanism to structurally fuse and simplify the dual CNN models and extract the spatial features from the HS and LiDAR data. The latter first concatenates the HS and LiDAR data to construct a uniform graph structure. Then, the dual GCN models perform structural fusion by sharing the graph structures and weight matrices of some layers to extract their respective structural information. Finally, the hybrid features are fed into a standard classifier for the pixel-level classification task under a unified feature fusion module. Extensive experiments on two real-world hyperspectral and LiDAR datasets demonstrate the effectiveness and superiority of the proposed method compared to other state-of-the-art baseline methods, such as the two-branch CNN and context CNN. In particular, the overall accuracy (99.11%) on Trento is the best classification performance reported so far.

  • Research Article
  • Cited by 135
  • 10.1016/j.rse.2019.111323
Combining LiDAR and hyperspectral data for aboveground biomass modeling in the Brazilian Amazon using different regression algorithms
  • Aug 7, 2019
  • Remote Sensing of Environment
  • Catherine Torres De Almeida + 11 more

  • Research Article
  • Cited by 21
  • 10.1109/jstars.2018.2868142
Fusion of Hyperspectral and LiDAR Data Using Discriminant Correlation Analysis for Land Cover Classification
  • Oct 1, 2018
  • IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
  • Farah Jahan + 3 more

It is evident that using complementary features from different sensors is effective for land cover classification. Therefore, combining complementary information from hyperspectral (HS) and light detection and ranging (LiDAR) data can greatly assist in such applications. In this paper, we propose a model for land cover classification that extracts effective features representing different characteristics (e.g., spectral, geometrical/structural) of objects of interest from these two complementary data sources and fuses them effectively by incorporating a dimensionality reduction technique. The HS bands are first grouped based on their joint entropy and structural similarity for group-wise spatial feature extraction. The spectral and spatial features from HS are then fused in parallel via the discriminant correlation analysis (DCA) method for each band group. This is followed by a multisource fusion step between the spatial features extracted from the HS and LiDAR data using DCA. The resultant features from both the band-group fusion and multisource fusion steps are concatenated with several other features extracted from the HS and LiDAR data. In the proposed model, DCA fusion produces discriminative features by eliminating between-class correlations and confining within-class correlations. We compare the performance of our feature extraction and fusion scheme using random forest and support vector machine classifiers. We also compare our approach with several state-of-the-art approaches on two benchmark land cover datasets and show that our approach outperforms the alternatives by a large margin.

  • Research Article
  • Cited by 29
  • 10.3389/fpls.2022.964769
Identification of tree species based on the fusion of UAV hyperspectral image and LiDAR data in a coniferous and broad-leaved mixed forest in Northeast China
  • Sep 23, 2022
  • Frontiers in Plant Science
  • Hao Zhong + 7 more

Rapid and accurate identification of tree species via remote sensing technology has become one of the important means for forest inventory. This paper aims to develop an accurate tree species identification framework that integrates unmanned airborne vehicle (UAV)-based hyperspectral imagery and light detection and ranging (LiDAR) data under the complex conditions of a natural coniferous and broad-leaved mixed forest. First, the UAV-based hyperspectral image and LiDAR data were obtained from a natural coniferous and broad-leaved mixed forest in the Maoer Mountain area of Northeast China. The preprocessed LiDAR data were segmented using a distance-based point cloud clustering algorithm to obtain the point clouds of individual trees; the hyperspectral image was segmented using the projected outlines of the individual tree point clouds to obtain the hyperspectral data of individual trees. Then, different hyperspectral and LiDAR features were extracted, and the importance of the features was analyzed by a random forest (RF) algorithm in order to select appropriate features for the single-source and multi-source data. Finally, tree species identification in the study area was conducted using a support vector machine (SVM) algorithm together with the hyperspectral features, the LiDAR features and the fused features, respectively. Results showed that the total accuracy for individual tree segmentation was 84.62%, and the fused features achieved the best tree species identification accuracy (total accuracy = 89.20%), followed by the hyperspectral features (total accuracy = 86.08%) and the LiDAR features (total accuracy = 76.42%).
The optimal features for tree species identification based on the fusion of hyperspectral and LiDAR data included the vegetation indices sensitive to the chlorophyll, anthocyanin and carotene contents of the leaves, selected components of the independent component analysis (ICA), minimum noise fraction (MNF) and principal component analysis (PCA) transforms, and the intensity features of the LiDAR echo. It was concluded that the framework developed in this study is effective for tree species identification under the complex conditions of a natural coniferous and broad-leaved mixed forest, and that the fusion of UAV-based hyperspectral imagery and LiDAR data achieves enhanced accuracy compared with single-source UAV-based remote sensing data.

  • Research Article
  • Cited by 7
  • 10.1080/01431161.2021.1939906
Unsupervised segmentation of LiDAR fused hyperspectral imagery using pointwise mutual information
  • Jun 23, 2021
  • International Journal of Remote Sensing
  • Orhan Torun + 1 more

In the segmentation of hyperspectral images (HSI), difficulties arise when different objects with similar spectral characteristics are being distinguished. If these objects with similar spectral information have different altitudes, it is possible to partition them based on the elevation information which can be obtained with a light detection and ranging (LiDAR) sensor. In this study, we propose a new affinity matrix to be used in a spectral clustering (SC) framework for the unsupervised segmentation of HSI and LiDAR data. To compose this new affinity matrix, spatial-spectral information obtained from HSI and elevation information obtained from LiDAR are combined using Pointwise Mutual Information (PMI). PMI is a measure of how much one pixel tells about the others in an image. It relies on the fact that the pixels of the same object in the given scene have a higher statistical dependence than the pixels of different objects. Hence, segmenting HSI and LiDAR data using PMI can provide a more comprehensive interpretation of the objects in images. The experimental results on two different real data sets show that the proposed method is very effective for unsupervised segmentation of HSI and LiDAR data and it is much faster when compared to competing spectral clustering algorithms.
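Pointwise mutual information between two discrete observations i and j is pmi(i, j) = log[p(i, j) / (p(i)·p(j))]: positive when they co-occur more often than independence would predict. A minimal sketch on toy paired labels follows (not the paper's actual HSI/LiDAR affinity construction, which combines spatial-spectral and elevation channels):

```python
import numpy as np

def pmi_matrix(x, y, k):
    """PMI between paired discrete observations x, y with values in 0..k-1.

    pmi(i, j) = log( p(i, j) / (p_x(i) * p_y(j)) ); positive entries mean
    i and j co-occur more often than independence would predict.
    """
    joint = np.zeros((k, k))
    for i, j in zip(x, y):
        joint[i, j] += 1
    joint /= joint.sum()                            # joint distribution p(i, j)
    px, py = joint.sum(axis=1), joint.sum(axis=0)   # marginals
    with np.errstate(divide="ignore"):              # log(0) -> -inf for unseen pairs
        return np.log(joint / (px[:, None] * py[None, :]))

x = np.array([0, 0, 1, 1, 0, 1])
y = np.array([0, 0, 1, 1, 0, 1])  # perfectly dependent toy pairing
P = pmi_matrix(x, y, 2)
```

For this perfectly dependent pairing, matching labels get pmi = log 2 > 0 while mismatched labels never co-occur (pmi = -inf), which is exactly the statistical-dependence signal the proposed affinity matrix exploits for same-object pixels.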

  • Research Article
  • Cited by 48
  • 10.3390/rs10122019
Classification of Expansive Grassland Species in Different Growth Stages Based on Hyperspectral and LiDAR Data
  • Dec 12, 2018
  • Remote Sensing
  • Adriana Marcinkowska-Ochtyra + 3 more

Expansive species classification with remote sensing techniques offers great support for botanical field work aimed at detecting their distribution within areas of conservation value and assessing the threat they pose to natural habitats. A large number of spectral bands and high spatial resolution allow for the identification of particular species. LiDAR (Light Detection and Ranging) data provide information on vegetation structure. Because the species differ in their features during the growing season, it is important to know when their spectral responses are unique against the background of the surrounding vegetation. The aim of the study was to identify two expansive grass species, Molinia caerulea and Calamagrostis epigejos, in a Natura 2000 area in Poland depending on the period and dataset used. Field work was carried out during late spring, summer and early autumn, in parallel with remote sensing data acquisition. Airborne 1-m resolution HySpex images and LiDAR data were used. The HySpex images were corrected geometrically and atmospherically before Minimum Noise Fraction (MNF) transformation and vegetation index calculation. From the LiDAR point cloud, a Canopy Height Model, vegetation structure metrics from discrete and full-waveform data, and topographic indices were generated. Classifications were performed using a Random Forest algorithm. The results include post-classification maps and their accuracies: the Kappa value and the F1 score, the harmonic mean of producer (PA) and user (UA) accuracy, calculated iteratively. Based on these accuracies and botanical knowledge, it was possible to determine the best identification date and dataset for analysing both species. For M. caerulea the highest median Kappa was 0.85 (F1 = 0.89) in August, and for C. epigejos 0.65 (F1 = 0.73) in September. For both species, adding discrete or full-waveform LiDAR data improved the results.
We conclude that hyperspectral (HS) and LiDAR airborne data could be useful to identify grassland species encroaching into Natura 2000 habitats and for supporting their monitoring.

  • Research Article
  • Cited by 3
  • 10.3390/f11121252
Estimation of Tree Height by Combining Low Density Airborne LiDAR Data and Images Using the 3D Tree Model: A Case Study in a Subtropical Forest in China
  • Nov 26, 2020
  • Forests
  • Xiaocheng Zhou + 4 more

In general, low density airborne LiDAR (Light Detection and Ranging) data are typically used to obtain the average height of forest trees. If such data could also be used to obtain tree height at the single-tree level, their usefulness would be greatly extended. Since the tree top position is often missed by low density LiDAR pulse points, the forest tree height estimated at the single-tree level is generally lower than the actual tree height when low density LiDAR data are used. To resolve this problem, a modified approach based on a three-dimensional (3D) parameter tree model was adopted in this paper to reconstruct tree height at the single-tree level by combining the characteristics of high resolution remote sensing images and low density airborne LiDAR data. The approach was applied to two coniferous forest plots in the subtropical forest region of Fujian Province, China. The following conclusions were reached after analyzing the results: The marker-controlled watershed segmentation method can effectively extract the crown profile from sub-meter-resolution images without the aid of the height information in the LiDAR data. The adaptive local maximum method satisfies the need for detecting the vertex of a single tree crown. The improved following-valley approach can be used to estimate the tree crown diameter. The 3D parameter tree model, which can take advantage of low-density airborne LiDAR data and high resolution images, is feasible for improving the estimation accuracy of tree height. Compared to tree height results obtained using only the low density LiDAR data, this approach achieves higher estimation accuracy. The accuracy of tree height estimation at the single-tree level for the two test areas was more than 80%, and the average estimation error of the tree height was 0.7 m.
The modified approach based on the three-dimensional parameter tree model can effectively increase the estimation accuracy of individual tree height by combining the characteristics of high resolution remote sensing images and low density airborne LiDAR data.

  • Research Article
  • Cited by 11
  • 10.3390/rs12020217
Direct Estimation of Forest Leaf Area Index based on Spectrally Corrected Airborne LiDAR Pulse Penetration Ratio
  • Jan 8, 2020
  • Remote Sensing
  • Yonghua Qu + 6 more

The leaf area index (LAI) is a crucial structural parameter of forest canopies. Light Detection and Ranging (LiDAR) provides an alternative to passive optical sensors for estimating LAI from remotely sensed data. However, LiDAR-based LAI estimation typically relies on empirical models, and such methods can only be applied when field-based LAI data are available. Compared with an empirical model, a physically based model, e.g., the light extinction model based on the Beer–Lambert law, is more attractive because it does not depend on a training dataset. However, two challenges are encountered when applying the physically based model to estimate LAI from discrete LiDAR data: deriving the gap fraction and the extinction coefficient from the LiDAR data. We solved the first problem by integrating LiDAR and hyperspectral data to convert the LiDAR penetration ratio into the forest gap fraction. For the second problem, the extinction coefficient was estimated from tiled (1 km × 1 km) LiDAR data by nonlinearly optimizing the cost function between the angular LiDAR gap fraction and the gap fraction simulated from the Beer–Lambert law model. A validation against LAI-2000 measurements showed that the estimates were significantly correlated with the reference LAI, with an R2 of 0.66, a root mean square error (RMSE) of 0.60 and a relative RMSE of 0.15. We conclude that forest LAI can be directly estimated by the nonlinear optimization method utilizing the Beer–Lambert model and a spectrally corrected LiDAR penetration ratio. The significance of the proposed method is that it can produce reliable remotely sensed forest LAI estimates from discrete LiDAR and spectral data when field-measured LAI values are unavailable.
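The physically based model above inverts the Beer–Lambert law: with gap fraction P = exp(-k·LAI), the index follows as LAI = -ln(P)/k. A minimal sketch of that inversion (the extinction coefficient k below is purely illustrative; the paper estimates k by nonlinear optimization over angular gap fractions):

```python
import numpy as np

def lai_beer_lambert(gap_fraction, k):
    """Invert Beer-Lambert light extinction: P = exp(-k * LAI).

    gap_fraction: canopy gap fraction in (0, 1], e.g. a spectrally
    corrected LiDAR penetration ratio. k: extinction coefficient
    (estimated in the paper by fitting angular gap fractions; the
    value used below is purely illustrative).
    """
    return -np.log(gap_fraction) / k

# Denser canopies transmit less light, i.e. smaller gap fraction -> larger LAI.
lai = lai_beer_lambert(np.array([0.05, 0.2, 0.6]), k=0.5)
```

Note the inversion is only as good as its inputs: a fully open canopy (P = 1) correctly gives LAI = 0, while P approaching 0 drives LAI to infinity, which is why the gap fraction must be carefully corrected before applying the formula.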

  • Research Article
  • Cited by 27
  • 10.1179/0093469012z.00000000026
Integrating LiDAR data and conventional mapping of the Fort Center site in south-central Florida: A comparative approach
  • Nov 1, 2012
  • Journal of Field Archaeology
  • Thomas J Pluckhahn + 1 more

Publicly available LiDAR (Light Detection and Ranging) data provide a potential windfall for archaeologists, permitting the creation of detailed topographic site maps with little more than an internet-connected computer and appropriate software. The quality of these LiDAR data for site mapping is variable, however, and may need to be supplemented with data obtained from conventional mapping techniques. We share insights from recent mapping of the Fort Center site (8GL13) in southern Florida. Specifically, we suggest a method—based on trial and error—for integrating LiDAR and total station survey data. We compare the results of our work with previous efforts at mapping the site based solely on conventional archaeological survey methods, as well as with results based on LiDAR data alone. We conclude that our combination of LiDAR data, corrected by conventional survey data, produces the most accurate map.
