Ensemble Selection Research Articles

Abrupt environmental changes can lead to evolutionary shifts in trait evolution. Identifying these shifts is an important step in understanding the evolutionary history of phenotypes. The detection performances of different methods are influenced by many factors, including different numbers of shifts, shift sizes, where a shift occurs on a tree, and the types of phylogenetic structure. Furthermore, the model assumptions are oversimplified, so are likely to be violated in real data, which could cause the methods to fail. We perform simulations to assess the effect of these factors on the performance of shift detection methods. To make the comparisons more complete, we also propose an ensemble variable selection method (R package ELPASO) and compare it with existing methods (R packages ℓ1ou and PhylogeneticEM). The performances of methods are highly dependent on the selection criterion. ℓ1ou+pBIC is usually the most conservative method and it performs well when signal sizes are large. ℓ1ou+BIC is the least conservative method and it performs well when signal sizes are small. The ensemble method provides more balanced choices between those two methods. Moreover, the performances of all methods are heavily impacted by measurement error, tree reconstruction error and shifts in variance.

Feature selection plays a crucial role in classification tasks as part of data preprocessing. Effective feature selection can improve the robustness and interpretability of learning algorithms and accelerate model learning. However, traditional statistical methods for feature selection are no longer practical for high-dimensional data because of their computational complexity. Ensemble learning, a prominent method in machine learning, has demonstrated exceptional performance, particularly in classification problems. To address this issue, we propose a three-stage feature selection framework for high-dimensional data based on ensemble learning (EFS-GINI). First, highly linearly correlated features are eliminated using the Spearman coefficient. Then, a feature selector based on the F-test is employed for the first-stage selection. In the second stage, four feature subsets are formed in parallel using mutual information (MI), ReliefF, SURF, and SURF* filters. The third stage performs feature selection using a combinator based on the Gini coefficient. Finally, a soft voting approach is employed for classification, combining decision tree, naive Bayes, support vector machine (SVM), k-nearest neighbors (KNN), and random forest classifiers. To demonstrate the effectiveness and efficiency of the proposed algorithm, we evaluate it on eight high-dimensional datasets and compare it with five other feature selection methods. Experimental results show that our method effectively enhances the accuracy and speed of feature selection. Moreover, to explore the biological significance of the proposed algorithm, we apply it to the renal cell carcinoma dataset GSE40435 from the Gene Expression Omnibus database. Two feature genes, NOP2 and NSUN5, are selected by our proposed algorithm. They are directly involved in regulating m5C RNA modification, which reveals the biological importance of EFS-GINI. Through bioinformatics analysis, we show that m5C-related genes play an important role in the occurrence and progression of renal cell carcinoma and are expected to become important markers for predicting patient prognosis.
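
The staged pipeline described in this abstract can be illustrated with a rough sketch. This is not the authors' EFS-GINI implementation: it uses only SciPy and scikit-learn, replaces the four parallel filters (MI, ReliefF, SURF, SURF*) with mutual information alone, and stands in random-forest Gini importance for the Gini-based combinator, whose details the abstract does not give. The function names, thresholds, subset sizes, and the synthetic dataset are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def drop_correlated(X, threshold=0.9):
    """Greedily keep columns whose |Spearman rho| with every
    already-kept column is at most `threshold` (assumed cutoff)."""
    rho, _ = spearmanr(X)  # correlation matrix when X has more than 2 columns
    rho = np.abs(np.atleast_2d(rho))
    keep = []
    for j in range(X.shape[1]):
        if all(rho[j, k] <= threshold for k in keep):
            keep.append(j)
    return np.array(keep)


def select_features(X, y, k_f=20, k_mi=10, k_final=5, seed=0):
    """Hypothetical staged selection; returns sorted column indices."""
    kept = drop_correlated(X)
    # Stage 1: rank by ANOVA F-score and keep the top k_f.
    f_scores, _ = f_classif(X[:, kept], y)
    kept = kept[np.argsort(f_scores)[::-1][:k_f]]
    # Stage 2: mutual information only (stand-in for MI/ReliefF/SURF/SURF*).
    mi = mutual_info_classif(X[:, kept], y, random_state=seed)
    kept = kept[np.argsort(mi)[::-1][:k_mi]]
    # Stage 3: random-forest Gini importance (stand-in for the combinator).
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X[:, kept], y)
    kept = kept[np.argsort(rf.feature_importances_)[::-1][:k_final]]
    return np.sort(kept)


def build_voter(seed=0):
    """Soft voting over the five classifier families named in the abstract."""
    return VotingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(random_state=seed)),
            ("nb", GaussianNB()),
            ("svm", SVC(probability=True, random_state=seed)),  # needs predict_proba
            ("knn", KNeighborsClassifier()),
            ("rf", RandomForestClassifier(random_state=seed)),
        ],
        voting="soft",
    )


# Synthetic data; the paper's datasets are not reproduced here.
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=5, random_state=0)
selected = select_features(X, y)
scores = cross_val_score(build_voter(), X[:, selected], y, cv=5)
print("selected columns:", selected)
print("mean CV accuracy:", scores.mean())
```

For brevity the sketch selects features on the full dataset before cross-validation, which leaks label information into the accuracy estimate; a fair evaluation would repeat the selection inside each training fold.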

Articles published on Ensemble Selection

A penalized variable selection ensemble algorithm for high-dimensional group-structured data.

Ensemble learning classifiers hybrid feature selection for enhancing performance of intrusion detection system

Evolutionary shift detection with ensemble variable selection

Feature selection and its combination with data over-sampling for multi-class imbalanced datasets

Selective ensemble method for anomaly detection based on parallel learning

Cluster ensemble selection based on maximum quality-maximum diversity

A cross-domain intelligent fault diagnosis method based on multi-source domain feature adaptation and selection

The Credit Card Anti-fraud Detection Model in the Context of Dynamic Integration Selection Algorithm

An ensemble learning-based feature selection algorithm for identification of biomarkers of renal cell carcinoma.

Ensemble Feature Selection Method for Single Pulse Classification

Design optimization of a multi-source renewable energy system using a novel method based on selective ensemble learning

Classifier Ensemble Based on Multiview Optimization for High-Dimensional Imbalanced Data Classification.

A Learning Approach for The Identification of Network Intrusions Based on Ensemble XGBoost Classifier

Ensemble Feature Selection Using Neighbourhood Rough Set–Based Multicriterion Fusion

Classification of Breast Cancer using Ensemble Filter Feature Selection with Triplet Attention Based Efficient Net Classifier

Dynamic Ensemble Selection for Imbalanced Data Streams With Concept Drift.

Road Traffic Crash Severity Analysis: A Bayesian-Optimized Dynamic Ensemble Selection Guided by Instance Hardness and Region of Competence Strategy

A Novel Heuristic-Based Selective Ensemble Prediction Method for Digital Financial Fraud Risk

Ensemble Feature Selection for Age Estimation from Speech

UNMASKING FRAUDSTERS: Ensemble Features Selection to Enhance Random Forest Fraud Detection
