Wrapper Feature Selection Research Articles

As climate change intensifies, the frequency and severity of waterlogging are expected to increase, necessitating a deeper understanding of how cucumber responds to this stress. In this study, three public RNA-seq datasets (PRJNA799460, PRJNA844418, and PRJNA678740) comprising 36 samples were analyzed. Several feature selection algorithms with different characteristics, namely Uncertainty, Relief, SVM (Support Vector Machine), Correlation, and logistic least absolute shrinkage and selection operator (LASSO), were applied to reduce the complexity of the data and thereby identify the most significant genes related to the waterlogging stress response. Uncertainty, Relief, SVM, Correlation, and LASSO identified 4, 4, 10, 21, and 13 genes, respectively. Differential gene correlation analysis (DGCA) of the 36 selected genes identified changes in their correlation patterns under waterlogged versus control conditions, providing deeper insights into the regulatory networks and interactions among them. DGCA revealed significant changes in the correlation of 13 genes between control and waterlogging conditions. Finally, we validated the 13 genes using a Random Forest (RF) classifier, which achieved 100% accuracy and an Area Under the Curve (AUC) score of 1.0. The SHapley Additive exPlanations (SHAP) values clearly showed the significant impact of LOC101209599, LOC101217277, and LOC101216320 on the model’s predictive power. In addition, we employed Boruta as a wrapper feature selection method to further validate our gene selection strategy. Eight of the 13 genes were common across the four feature weighting algorithms, LASSO, DGCA, and Boruta, underscoring the robustness and reliability of our gene selection strategy.
Notably, the genes LOC101209599, LOC101217277, and LOC101216320 were among the genes identified by multiple feature selection methods from different categories (filter, wrapper, and embedded). Pathways associated with these genes play a pivotal role in regulating stress tolerance, root development, nutrient absorption, sugar metabolism, gene expression, protein degradation, and calcium signaling. These intricate regulatory mechanisms are crucial for cucumbers to adapt effectively to waterlogging conditions. The findings provide valuable insights for uncovering targets in breeding new cucumber varieties with enhanced stress tolerance.
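The idea of keeping only features that several selection methods agree on can be illustrated with a small vote count. This is a minimal sketch, not the study's pipeline: the function name `consensus_features`, the `min_methods` threshold, and the identifiers `g1` to `g5` are hypothetical placeholders and do not correspond to genes or results reported in the abstract.

```python
from collections import Counter


def consensus_features(selections, min_methods):
    """Return (feature, votes) pairs for features chosen by at least
    `min_methods` selection methods, most-voted first."""
    # Count each feature at most once per method.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    # Sort by descending vote count, then by name for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(feature, n) for feature, n in ranked if n >= min_methods]


# Hypothetical placeholder identifiers, NOT genes or results from the study.
example_selections = {
    "relief": ["g1", "g2", "g3"],
    "lasso": ["g2", "g3", "g4"],
    "boruta": ["g3", "g5"],
}
consensus = consensus_features(example_selections, min_methods=2)
```

Raising `min_methods` makes the consensus stricter, at the cost of discarding features that only one family of methods (filter, wrapper, or embedded) is able to detect.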

Soil spectroscopy estimates soil properties using the absorption features in soil spectra. However, modelling soil properties with soil spectroscopy is challenging due to the high dimensionality of spectral data. Wrapper feature selection methods are a promising way to reduce this dimensionality but are rarely used in soil spectroscopy. The aim of this study is to evaluate the performance of two wrapper feature selection methods, Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS), built using the Random Forest (RF) algorithm, for dimensionality reduction of spectral data and predictive modelling of soil organic matter (SOM), clay and carbonates. The reflectance of 100 soil samples, acquired from Sierra de las Nieves (Spain), was measured under laboratory conditions using an ASD FieldSpec Pro JR. Four datasets were evaluated: the raw spectra, two preprocessed versions obtained with Continuum Removal (CR) and Multiplicative Scatter Correction (MSC), and a so-called “Global” dataset composed of raw, CR and MSC features. The performance of RF models built with feature selection methods was compared to that of Partial Least Squares Regression (PLSR) and RF alone. RF models built with SFS and SFFS outperformed both the PLSR and the RF-alone models: the best RF models with feature selection had a respective ratio of performance to interquartile distance (RPIQ) of 1.93, 0.38 and 2.56. PLSR models scored 1.41, 0.29 and 1.81 for SOM, carbonates, and clay, respectively. RF alone scored 1.29, 0.29 and 1.81, respectively. The application of wrapper feature selection methods reduced the number of features to less than 1% of the starting features. Features were selected across the full spectral range for SOM and clay, and around 900 nm, 1900 nm, and 2350 nm for carbonates.
However, feature selection highlighted features around 1100 nm in SOM modelling, as well as other features around 2200 nm, which is considered a main absorption feature of clay. The application of feature selection with Random Forest proved important for improving modelling accuracy, reducing redundant features and avoiding the curse of dimensionality (Hughes effect). Thus, this research demonstrates an alternative to the dimensionality reduction approaches applied to date for modelling soil properties with spectroscopy, and paves the way for further scientific investigation based on feature selection methods and machine learning.
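The wrapper idea described in this abstract, greedily adding the feature that most improves a model's score, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the study's implementation: a leave-one-out 1-nearest-neighbour regressor stands in for Random Forest to keep the example dependency-free, the score is the ratio of performance to interquartile distance computed as IQR(observed) / RMSE with inclusive quartiles, and the names `rpiq`, `loo_predictions`, `sequential_forward_selection`, `X_demo` and `y_demo` are hypothetical. The toy data are synthetic and unrelated to the soil samples.

```python
import math
import statistics


def rpiq(observed, predicted):
    """Ratio of performance to interquartile distance: IQR(observed) / RMSE."""
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    rmse = math.sqrt(mse)
    return (q3 - q1) / rmse if rmse > 0 else math.inf


def loo_predictions(X, y, features):
    """Leave-one-out 1-nearest-neighbour predictions using only `features`.

    Assumption: this simple model replaces Random Forest for illustration.
    """
    preds = []
    for i, row in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((row[f] - X[j][f]) ** 2 for f in features),
        )
        preds.append(y[nearest])
    return preds


def sequential_forward_selection(X, y, max_features):
    """Greedy SFS: add the best-scoring feature until the score stops improving."""
    selected, best_score = [], -math.inf
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [
            (rpiq(y, loo_predictions(X, y, selected + [f])), f) for f in remaining
        ]
        score, feature = max(scored)
        if score <= best_score:
            break  # no candidate improves the wrapper score
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Synthetic toy data: column 0 drives the target, columns 1 and 2 are distractors.
X_demo = [[i, (i * 7) % 5, (i * 3) % 4] for i in range(12)]
y_demo = [2.0 * i for i in range(12)]
selected, score = sequential_forward_selection(X_demo, y_demo, max_features=3)
```

The floating variant (SFFS) differs in that, after each addition, it also tries removing previously selected features whenever doing so improves the score; that backward step is omitted here for brevity.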

Articles published on Wrapper Feature Selection

LLpowershap: logistic loss-based automated Shapley values feature selection method

IMOABC: An efficient multi-objective filter–wrapper hybrid approach for high-dimensional feature selection

A wrapper feature selection approach using Markov blankets

Turkish Text Classification Based On Wrapper Feature Selection Using Particle Swarm Optimization

Linear Monotonic Inter-electrode Associations as Quantitative EEG for Alcoholism Diagnosis

Uncovering waterlogging-responsive genes in cucumber through machine learning and differential gene correlation analysis

Financial distress prediction using an improved particle swarm optimization wrapper feature selection method and tree boosting ensemble

Exploring Machine Learning Algorithms to Predict Diarrhea Disease and Identify its Determinants among Under-Five Years Children in East Africa

Advancing XSS Detection in IoT over 5G: A Cutting-Edge Artificial Neural Network Approach

Predicting pedalling metrics based on lower limb joint kinematics

Investigating the Performance of a Novel Modified Binary Black Hole Optimization Algorithm for Enhancing Feature Selection

Detection of chronic kidney disease using binary whale optimization algorithm

Potential dual inhibitors of Hexokinases and mitochondrial complex I discovered through machine learning approach

ONE3A: one-against-all authentication model for smartphone using GAN network and optimization techniques.

A novel two-stage wrapper feature selection approach based on greedy search for text sentiment classification

Machine Learning and Feature Selection for soil spectroscopy. An evaluation of Random Forest wrappers to predict soil organic matter, clay, and carbonates

Predictive modeling for early detection of biliary atresia in infants with cholestasis: Insights from a machine learning study

Feature selection based on dynamic crow search algorithm for high-dimensional data classification

Optimizing prediction accuracy for early recurrent lumbar disc herniation with a directional mutation-guided SVM model

Efficient traffic-based IoT device identification using a feature selection approach with Lévy flight-based sine chaotic sub-swarm binary honey badger algorithm
