Abstract

Multicollinearity between feature bands is one of the main sources of interference when retrieving chlorophyll-a (Chl-a) concentration in water bodies from hyperspectral data. Model capability is also a decisive factor for inversion accuracy. To eliminate multicollinearity between feature bands and strengthen the data-driven side of Chl-a retrieval, this study proposes a feature band selection strategy (FD-FI) based on knee-point detection and the variance inflation factor (VIF). On the model-driven side, nine machine learning algorithms are combined to construct a MixModel, which is compared with other models. Chl-a concentration in Nansi Lake was estimated using “Zhuhai-No.1” remote sensing images and field-measured data. The results show that the FD-FI strategy effectively eliminates multicollinearity between bands or band combinations (VIF < 7). With the same model, the proposed strategy achieves higher accuracy than existing strategies. In five-fold cross-validation, XGBoostFD-FI obtained the best performance, with a coefficient of determination (R²) of 0.8351 and a root mean squared error (RMSE) of 6.6477 μg/L. Combined with MixModel, the FD-FI strategy further improves the accuracy of Chl-a retrieval, with R² = 0.8664 and RMSE = 5.7926 μg/L. When the models were applied to remote sensing images, the Chl-a spatial distribution obtained with the FD-FI strategy was the most consistent across the four models. MixModel is more sensitive to very high and very low Chl-a concentrations, and it generalises more stably. Overall, this study provides an innovative approach to feature band selection and model construction for Chl-a retrieval in inland lakes.
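For readers unfamiliar with the VIF criterion mentioned above, the following is a minimal sketch of VIF-based screening, not the FD-FI procedure itself, whose details the abstract does not give. The VIF of feature j is 1 / (1 - R²_j), where R²_j comes from regressing feature j on all other features. The iterative "drop the worst feature" loop, the function names `vif` and `drop_collinear`, and the synthetic band names `b1`, `b2`, `b3` are assumptions made for illustration; only the threshold of 7 is taken from the abstract.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

def drop_collinear(X, names, threshold=7.0):
    """Assumed heuristic: remove the highest-VIF feature until all VIFs < threshold."""
    X = np.asarray(X, dtype=float)
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        v = vif(X[:, keep])
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break
        keep.pop(worst)
    return [names[i] for i in keep]

# Synthetic stand-in for band reflectances (hypothetical data, not from the study).
rng = np.random.default_rng(0)
b1 = rng.normal(size=200)
b2 = rng.normal(size=200)
b3 = b1 + 0.01 * rng.normal(size=200)  # nearly a copy of b1, so strongly collinear
X = np.column_stack([b1, b2, b3])
names = ["b1", "b2", "b3"]

selected = drop_collinear(X, names, threshold=7.0)
remaining_vif = vif(X[:, [names.index(s) for s in selected]])
print(selected, remaining_vif)
```

In this toy case one of the two near-duplicate bands is discarded and the independent band is kept. A constant column would make the total sum of squares zero, so real band data may need that case handled separately.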
