Abstract

Genomic microarray databases contain complex, high-dimensional gene expression samples. In an imbalanced microarray dataset, the genomic samples are unevenly distributed among the classes, which can degrade classification performance. Gene selection on such a dataset can therefore nominate misleading and inconsistent genes, which in turn degrades classification further. This poor performance arises because the sample distribution across the microarrays is skewed toward the majority class. In this paper, we propose a modified version of the Emperor Penguin Optimization (EPO) algorithm combined with a Random Forest (RF) classifier using bagging and boosting, named EPO-RF, to select the most informative genes from imbalanced microarray datasets based on classification accuracy. The modified EPO is built on decision trees that take the tree-splitting weight criterion into account in order to handle the class imbalance. Average gene expression binary values are used as a preliminary step for exploring disease trajectories with the aid of metaheuristic optimization feature selection algorithms. Results show that the proposed model outperforms well-established metaheuristic optimization algorithms, e.g., Harris Hawks Optimization (HHO), Grey Wolf Optimization (GWO), Salp Swarm Optimization (SSO), Particle Swarm Optimization (PSO), and Genetic Algorithms (GAs), on several pediatric sepsis microarray datasets for patients who were admitted to the Intensive Care Unit (ICU) for the first 24 h.
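The abstract does not spell out how the tree-splitting weights are computed. As a minimal sketch of one common way to make a decision-tree split criterion sensitive to class imbalance, the Python below weights each sample by the inverse frequency of its class before computing Gini impurity. The function names, the inverse-frequency weighting, and the choice of Gini impurity are assumptions made for illustration; they are not taken from the paper and should not be read as the EPO-RF criterion itself.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: inverse-frequency weight per class,
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_gini(labels, weights):
    """Gini impurity in which each sample counts with its class weight."""
    totals = Counter()
    for y in labels:
        totals[y] += weights[y]
    total = sum(totals.values())
    if total == 0:
        return 0.0  # an empty node is treated as pure
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


def split_impurity(values, labels, threshold, weights):
    """Weight-averaged impurity of the two children produced by
    splitting one gene's expression values at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    w_left = sum(weights[y] for y in left)
    w_right = sum(weights[y] for y in right)
    total = w_left + w_right
    if total == 0:
        return 0.0
    return (w_left * weighted_gini(left, weights)
            + w_right * weighted_gini(right, weights)) / total
```

With eight majority-class samples and two minority-class samples, the unweighted Gini impurity of the node is 0.32, whereas the weighted version evaluates to 0.5, so the minority class contributes as much to the split decision as the majority class. A wrapper-style gene selector could use such a criterion inside the trees whose accuracy scores each candidate gene subset.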
