A Prediction of Pediatric Cardiomyopathy Disease Associated Genes using Machine Learning Algorithms

doi:10.35940/ijrte.b1190.0882s819

Abstract

Pediatric cardiomyopathy is a heart disease caused by an abnormality of the heart muscle. If it remains unidentified and untreated in its early stages, it can lead to heart failure. The global number of deaths and disabilities attributed to cardiomyopathy has steadily increased. Machine learning approaches can help address this problem by identifying pediatric cardiomyopathy disease associated genes from a collection of differentially expressed genes that are characterized by their biological processes. The main objective of this study is to design a machine learning model that predicts, with maximum accuracy, the likelihood that a gene with specified biological features is associated with pediatric cardiomyopathy. Selected machine learning algorithms such as Logistic Regression, Naive Bayes, Random Forest, and Support Vector Machine were used in this experiment to identify such genes from an internal database repository in which the biological process of each gene is specified. Experiments were conducted on Gene Expression Omnibus (GEO) datasets sourced from cardiogenomics.org and the Biohunter tool. The performance of these machine learning algorithms was evaluated using measures such as Accuracy, Precision, Recall, F-Measure, and Receiver Operating Characteristic (ROC). The results show that Random Forest provides the highest accuracy, 84.4%, compared to the other machine learning algorithms.
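For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how Accuracy, Precision, Recall, F-Measure, and the area under the ROC curve can be computed for a binary gene classifier. It is an illustration only and not the authors' implementation: the function names `binary_metrics` and `roc_auc`, the label convention (1 = disease associated, 0 = not associated), and the toy labels and scores are all assumptions introduced here.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-measure for 0/1 labels.

    Assumed convention (not from the paper): 1 = disease-associated gene.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels, hard predictions, and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise formulation of the ROC area is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large gene sets.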

Full Text