Single Machine Learning Algorithm Research Articles

Increasing numbers of explanatory variables tend to result in information redundancy and “dimensional disaster” in the quantitative remote sensing of forest aboveground biomass (AGB). Feature selection of model factors is an effective method for improving the accuracy of AGB estimates. Machine learning algorithms are also widely used in AGB estimation, although little research has addressed the use of the categorical boosting algorithm (CatBoost) for AGB estimation. Both feature selection and regression for AGB estimation models are typically performed with the same machine learning algorithm, but there is no evidence to suggest that this is the best method. Therefore, the present study focuses on evaluating the performance of the CatBoost algorithm for AGB estimation and comparing the performance of different combinations of feature selection methods and machine learning algorithms. AGB estimation models of four forest types were developed based on Landsat OLI data using three feature selection methods (recursive feature elimination (RFE), variable selection using random forests (VSURF), and least absolute shrinkage and selection operator (LASSO)) and three machine learning algorithms (random forest regression (RFR), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost)). Feature selection had a significant influence on AGB estimation. RFE preserved the most informative features for AGB estimation and was superior to VSURF and LASSO. In addition, CatBoost improved the accuracy of the AGB estimation models compared with RFR and XGBoost. AGB estimation models using RFE for feature selection and CatBoost as the regression algorithm achieved the highest accuracy, with root mean square errors (RMSEs) of 26.54 Mg/ha for coniferous forest, 24.67 Mg/ha for broad-leaved forest, 22.62 Mg/ha for mixed forests, and 25.77 Mg/ha for all forests. 
The combination of RFE and CatBoost performed better than the VSURF–RFR combination, in which random forests were used for both feature selection and regression, indicating that feature selection and regression performed by a single machine learning algorithm may not always ensure optimal AGB estimation. Extending the application of new machine learning algorithms and feature selection methods is a promising way to improve the accuracy of AGB estimates.
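The idea of selecting features with one algorithm and fitting the regression with another can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration: the data are synthetic rather than Landsat OLI predictors, a random forest is assumed as the ranking model inside RFE (the abstract does not name one), scikit-learn's `GradientBoostingRegressor` stands in for CatBoost, and the number of retained features is arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a table of candidate predictors and a response (assumption).
X, y = make_regression(
    n_samples=300, n_features=20, n_informative=5, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 1: recursive feature elimination, ranked by a random forest (assumed ranker).
# Two features are dropped per round until eight remain (arbitrary choices).
selector = RFE(
    RandomForestRegressor(n_estimators=50, random_state=0),
    n_features_to_select=8,
    step=2,
)
selector.fit(X_train, y_train)

# Step 2: fit a different algorithm on the selected features only.
# GradientBoostingRegressor is a stand-in for CatBoost, not the study's model.
regressor = GradientBoostingRegressor(random_state=0)
regressor.fit(selector.transform(X_train), y_train)

# Evaluate on held-out data with the root mean square error.
pred = regressor.predict(selector.transform(X_test))
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print("selected feature indices:", np.flatnonzero(selector.support_).tolist())
print("test RMSE:", round(rmse, 2))
```

Keeping the selector and the regressor as separate objects makes it straightforward to swap either one and compare combinations, which is the kind of comparison the abstract describes.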

This study proposed a novel double machine learning (DML) approach to merge multiple satellite-based precipitation products (SPPs) and gauge observations, and tested its reliability and validity over the Chinese mainland. The DML approach was developed mainly by combining the random forest (RF) classification model with the regression models of machine learning (ML) algorithms, including RF, artificial neural network (ANN), support vector machine (SVM), and extreme learning machine (ELM). This led to four DML algorithms, i.e., RF-RF, RF-ANN, RF-SVM, and RF-ELM. The performance of the DML algorithms was compared with that of the single machine learning (SML) algorithms developed based solely on the regression models of RF, ANN, SVM, and ELM, and with the linear merging methods, including the inverse error variance weighting, the one-outlier-removed average, and the optimized weight average. In total, we produced twelve precipitation products: four from the DML algorithms, four from the SML algorithms, three from the linear merging methods, and one generated via gauge-only interpolation. The precipitation observations at 697 gauges were spatially and randomly divided into two parts (i.e., 70% and 30%); one was used for training the ML algorithms or for the interpolation, and the other for performance evaluation. Results indicate that the DML algorithms outperform the other merging methods, the gauge-only interpolation, and the original SPPs over the Chinese mainland. The median Kling-Gupta efficiency (KGE) ranges from 0.67 to 0.71 for the DML merged products, which is clearly higher than that of the original SPPs (0.31–0.54), the linear merged products (0.54–0.55), the gauge-only interpolated product (0.62), and the SML-based products (0.47–0.65).
The DML-based products also perform better than the other products in detecting precipitation events at the 1 mm/day threshold, and outperform the original SPPs regardless of the precipitation threshold. Further analyses imply that: (i) the DML-based products could outperform the original SPPs even with a small training dataset; (ii) the superiority of the DML approach over SML arises mainly because the former better captures the temporal dynamics of precipitation; (iii) the added value of the DML merged products relative to the original SPPs and the gauge-only product varies with the size of the training dataset; and (iv) an ensemble of the DML algorithms could not further improve the accuracy of the precipitation estimates. This study not only provided an effective and robust tool for the fusion of multiple SPPs and gauge observations, but also, for the first time, compared the performance of various ML algorithms in merging satellite and gauge-based precipitation.
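For readers unfamiliar with the Kling-Gupta efficiency used to rank the products, the following is a minimal sketch of the commonly cited 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means between estimated and observed series. The abstract does not state which KGE variant the study used, so this choice is an assumption, and the function name `kge` and the sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency (2009 form) of simulated vs. observed values."""
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have the same length (at least 2)")
    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    if sd_s == 0 or sd_o == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or a zero observed mean")
    # Pearson correlation from the population covariance.
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Made-up daily precipitation values in mm/day, for illustration only.
gauge = [0.0, 1.0, 2.0, 5.0]
merged = [0.0, 2.0, 4.0, 10.0]  # perfectly correlated but twice as large
print(kge(gauge, gauge))   # a perfect match scores 1 (up to rounding)
print(kge(merged, gauge))  # bias and variability errors lower the score
```

A value of 1 indicates perfect agreement, so the reported medians of 0.67–0.71 for the DML products sit closer to the ideal than the 0.31–0.54 of the original SPPs.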

Articles published on Single Machine Learning Algorithm

A supervised ensemble learning method for fault diagnosis in photovoltaic strings

Combination of Feature Selection and CatBoost for Prediction: The First Application to the Estimation of Aboveground Biomass

A data fusion approach to optimize compositional stability of halide perovskites

Machine learning predicts lymph node metastasis of poorly differentiated-type intramucosal gastric cancer

Merging multiple satellite-based precipitation products and gauge observations using a novel double machine learning approach

A particle swarm optimization based ensemble for vegetable crop disease recognition

Hybrid randomised learning‐based probabilistic data‐driven method for fault‐induced delayed voltage recovery assessment of power systems

Design of an Accurate Machine Learning Algorithm to Predict the Binding Energies of Several Adsorbates on Multiple Sites of Metal Surfaces

Super ensemble learning for daily streamflow forecasting: large-scale demonstration and comparison with multiple machine learning algorithms

A Robust Blood-based Signature of Cerebrospinal Fluid Aβ42 Status.

Prediction of photovoltaic power output based on similar day analysis, genetic algorithm and extreme learning machine

Heat load prediction of residential buildings based on discrete wavelet transform and tree-based ensemble learning

Type 2 Machine Learning: An Effective Hybrid Prediction Model for Early Type 2 Diabetes Detection

Recognition of Bangla Handwritten Number Using Combination of PCA and FIS with the Aid of DWT

Application of ensemble learning techniques to model the atmospheric concentration of SO2.

A Novel Approach of Weighted Support Vector Machine with Applied Chance Theory for Forecasting Air Pollution Phenomenon in Egypt

Ensemble Methods with Voting Protocols Exhibit Superior Performance for Predicting Cancer Clinical Endpoints and Providing More Complete Coverage of Disease-Related Genes.

Using Arbiter and Combiner Tree to Classify Contexts of Data

An Operational Framework for Land Cover Classification in the Context of REDD+ Mechanisms. A Case Study from Costa Rica

Indonesian Named-entity Recognition for 15 Classes Using Ensemble Supervised Learning
