Monitoring Forest Health Using Hyperspectral Imagery: Does Feature Selection Improve the Performance of Machine-Learning Techniques?

Patrick Schratz, Alexander Brenning, Eugenia Iturritxa, Bernd Bischl, Jannes Muenchow, José Cortés

doi:10.3390/rs13234832

Abstract

This study analyzed highly correlated, feature-rich datasets from hyperspectral remote sensing data using multiple statistical and machine-learning methods. The effect of filter-based feature selection methods on predictive performance was compared. In addition, the effect of multiple expert-based and data-driven feature sets, derived from the reflectance data, was investigated. Defoliation of trees (%), derived from in situ measurements from fall 2016, was modeled as a function of reflectance. Variable importance was assessed using permutation-based feature importance. Overall, the support vector machine (SVM) outperformed other algorithms, such as random forest (RF), extreme gradient boosting (XGBoost), and lasso (L1) and ridge (L2) regressions by at least three percentage points. The combination of certain feature sets showed small increases in predictive performance, while no substantial differences between individual feature sets were observed. For some combinations of learners and feature sets, filter methods achieved better predictive performances than using no feature selection. Ensemble filters did not have a substantial impact on performance. The most important features were located around the red edge. Additional features in the near-infrared region (800–1000 nm) were also essential to achieve the overall best performances. Filter methods have the potential to be helpful in high-dimensional situations and are able to improve the interpretation of feature effects in fitted models, which is an essential constraint in environmental modeling studies. Nevertheless, more training data and replication in similar benchmarking studies are needed to be able to generalize the results.
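As a rough illustration of what a filter-based feature selection step does, the sketch below scores each feature independently of any learner and keeps the top-ranked ones. It is not the authors' implementation: the filter used here (absolute Pearson correlation with the response), the function names `pearson` and `filter_top_k`, the band names, and the toy values are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant feature carries no linear signal
    return cov / (sd_x * sd_y)


def filter_top_k(features, target, k):
    """Rank features by |correlation| with the target and keep the best k.

    `features` maps a feature name to its list of values (hypothetical layout).
    """
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Toy reflectance values per band and a toy defoliation response (%); invented data.
features = {
    "band_700nm": [0.10, 0.20, 0.30, 0.40, 0.50],
    "band_850nm": [0.50, 0.10, 0.40, 0.20, 0.30],
    "band_450nm": [0.30, 0.30, 0.30, 0.30, 0.30],
}
defoliation = [10.0, 20.0, 30.0, 40.0, 50.0]

selected = filter_top_k(features, defoliation, k=2)
print(selected)
```

Because the score is computed before any model is fitted, the same ranking can be reused across learners; the number of features to keep (`k` here) would normally be tuned rather than fixed.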

Highlights

The use of machine learning (ML) algorithms for analyzing remote sensing data has seen a huge increase in the last decade [1].
Principal component analysis (PCA) was used to assess the complexity of the three feature sets.
Performance differences between test folds were large: predicting on Luiando resulted in a root mean square error (RMSE) of 9.0 percentage points (p.p.) for the support vector machine (SVM) learner, but up to 54.3 p.p. when testing on Laukiz2 (Table 4).
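For readers unfamiliar with the metric, the sketch below shows how an RMSE in percentage points could be computed when the response (defoliation) is itself a percentage. The function name `rmse` and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error; has the same unit as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented defoliation values in percent; the resulting error is in percentage points.
observed = [10.0, 40.0, 70.0]
predicted = [13.0, 36.0, 70.0]

error = rmse(observed, predicted)
print(f"RMSE: {error:.2f} p.p.")
```

Since the errors are differences between two percentages, the result is expressed in percentage points rather than percent.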

Summary

Introduction

The use of machine learning (ML) algorithms for analyzing remote sensing data has seen a huge increase in the last decade [1]. This coincided with the increased availability of remote sensing imagery, especially since the launch of the first Sentinel satellite in the year. Scientists can nowadays process large amounts of (environmental) information with relative ease using various learning algorithms. This makes it possible to extend benchmark comparison matrices of studies in a semi-automated way, possibly stumbling upon unexpected findings, such as process settings, that would not have been explored otherwise [2].

Objectives

Methods

Results

Discussion

Conclusion

Journal: Remote Sensing
Publication Date: Nov 28, 2021
Citations: 5
License type: CC BY 4.0
