Abstract
Genomic microarray databases encompass complex, high-dimensional gene expression samples. A microarray dataset is imbalanced when its genomic samples are unevenly distributed among the contributing classes, which can negatively affect classification performance. Gene selection from an imbalanced microarray dataset can therefore nominate misleading and inconsistent genes that alter the classification performance. Such unsatisfactory performance arises because the distribution of samples across the microarrays is skewed toward the majority class. In this paper, we propose a modified version of the Emperor Penguin Optimization (EPO) algorithm combined with Random Forest (RF) with Bagging and Boosting classification, named EPO-RF, to select the most informative genes based on classification accuracy using imbalanced microarray datasets. The modified version of EPO is built on decision trees and takes the weights of the tree-splitting criterion into account to handle imbalanced microarray datasets. Average gene expression binary values are used as a preliminary step for exploring disease trajectories with the aid of metaheuristic optimization feature selection algorithms. Results show that the proposed model outperforms well-established metaheuristic optimization algorithms, e.g., Harris Hawks Optimization (HHO), Grey Wolf Optimization (GWO), Salp Swarm Optimization (SSO), Particle Swarm Optimization (PSO), and Genetic Algorithms (GAs), on several pediatric sepsis microarray datasets for patients who were admitted to the Intensive Care Unit (ICU) for the first 24 h.
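To illustrate how weights in a tree-splitting criterion can offset class imbalance, the following is a minimal sketch and not the EPO-RF implementation. It assumes the common "balanced" weighting heuristic (weight = n / (k × class count)), a Gini impurity criterion, and binarization of each gene at its mean expression value; the abstract does not specify any of these choices, and all function names are hypothetical.

```python
from collections import Counter

def binarize_by_mean(values):
    """Map each expression value to 1 if it exceeds the gene's mean, else 0.
    Thresholding at the mean is an assumption, not taken from the paper."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count).
    This 'balanced' heuristic is an assumed weighting scheme."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def weighted_gini(labels, weights):
    """Gini impurity in which every sample counts with its class weight."""
    totals = Counter()
    for y in labels:
        totals[y] += weights[y]
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in totals.values())

def split_gain(feature, labels, weights):
    """Weighted impurity decrease obtained by splitting on a binary feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    w_left = sum(weights[y] for y in left)
    w_right = sum(weights[y] for y in right)
    children = (w_left * weighted_gini(left, weights)
                + w_right * weighted_gini(right, weights)) / (w_left + w_right)
    return weighted_gini(labels, weights) - children

# Toy imbalanced data: 8 majority-class samples, 2 minority-class samples.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
# One hypothetical gene whose expression is high only in the minority class.
gene = binarize_by_mean([1.0] * 8 + [6.0] * 2)
gain = split_gain(gene, labels, weights)
```

With these weights the two classes contribute equally to the impurity at the root, so a gene that separates the minority class is rewarded as much as one that separates the majority class. A wrapper method would evaluate candidate gene subsets with a classifier built on such a criterion.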