Ensemble of Trees for Classifying High-Dimensional Imbalanced Genomic Data

Dewan Md Farid, Bernard Manderick, Ann Nowe

doi:10.1007/978-3-319-56994-9_12

Abstract

In bioinformatics, machine learning for data mining aims to extract new knowledge that supports a better and more effective diagnosis process for patients. In this paper, we introduce an adaptive ensemble learning method for classifying high-dimensional, multi-class, imbalanced genomic data. The aim is to design and develop an optimal ensemble method for information discovery on genomic data that improves the prediction accuracy of DNA variant classification. The proposed method is based on an ensemble of decision trees, data pre-processing, feature selection and feature grouping. It converts an imbalanced genomic data set into multiple balanced ones and then builds a number of decision trees on these data sets, each with a specific feature group. The outputs of these trees are combined by majority voting to classify new instances. In this empirical study, ensemble predictive modelling techniques such as Random Forest, Boosting and Bagging were compared with the proposed ensemble method. The experimental results on genomic data of Brugada syndrome (148 exome data sets) from the Centre of Medical Genetics, VUB UZ Brussel show that the proposed method is usually superior to the conventional ensemble learning algorithms when classifying high-dimensional, multi-class, imbalanced genomic data.
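
The pipeline described in the abstract (balanced subsets, one tree per feature group, majority voting) can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: balancing is done by random undersampling to the size of the rarest class, a depth-1 tree (decision stump) stands in for a full decision tree, the feature groups are supplied by hand, and the toy data, class labels and function names are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_subset(X, y, rng):
    """Undersample every class to the size of the rarest class (assumed strategy)."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    n = min(len(idx) for idx in by_class.values())
    chosen = [i for idx in by_class.values() for i in rng.sample(idx, n)]
    return [X[i] for i in chosen], [y[i] for i in chosen]


def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def fit_stump(X, y, features):
    """Depth-1 tree restricted to `features`; a stand-in for a full decision tree."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the most common label
        label = Counter(y).most_common(1)[0][0]
        return (features[0], float("inf"), label, label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_ensemble(X, y, feature_groups, seed=0):
    """Train one stump per feature group, each on its own balanced subset."""
    rng = random.Random(seed)
    ensemble = []
    for group in feature_groups:
        Xb, yb = balanced_subset(X, y, rng)
        ensemble.append(fit_stump(Xb, yb, group))
    return ensemble


def predict_ensemble(ensemble, row):
    """Combine the individual predictions by majority voting."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 8 majority-class rows, 2 minority-class rows, 4 features.
X = [[0, 1, 2, 0], [1, 0, 1, 2], [2, 2, 0, 1], [0, 0, 1, 1],
     [1, 2, 2, 0], [2, 1, 0, 2], [0, 2, 1, 0], [1, 1, 0, 1],
     [10, 11, 12, 10], [12, 10, 11, 12]]
y = ["benign"] * 8 + ["pathogenic"] * 2
feature_groups = [[0, 1], [2, 3], [0, 3]]  # hand-picked groups, for illustration only

model = fit_ensemble(X, y, feature_groups)
print(predict_ensemble(model, [12, 12, 12, 12]))
print(predict_ensemble(model, [0, 0, 0, 0]))
```

Each stump sees a different balanced sample and a different feature group, so the ensemble members differ even though they come from the same imbalanced data. In practice the stump would be replaced by a full decision tree learner, and the feature groups would come from a feature selection and grouping step rather than being fixed by hand.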
