AGRN: accurate gene regulatory network inference using ensemble machine learning methods.

Duaa Mohammad Alawad, Md Wasi Ul Kabir, Md Tamjidul Hoque, Ataur Katebi

doi:10.1093/bioadv/vbad032

Open Access

Abstract

Biological processes are regulated by underlying genes and their interactions, which form gene regulatory networks (GRNs). Dysregulation of these GRNs can cause complex diseases such as cancer, Alzheimer's disease and diabetes. Hence, accurate GRN inference is critical for elucidating gene function, allowing faster identification and prioritization of candidate genes for functional investigation. Several statistical and machine learning-based methods have been developed to infer GRNs from biological and synthetic datasets. Here, we developed a method named AGRN that infers GRNs by employing an ensemble of machine learning algorithms. Because a single method may not perform well on all datasets, we calculate the gene importance scores using three machine learning methods: random forest, extra tree and support vector regressors. We calculate the importance scores from Shapley Additive Explanations, a recently published method to explain machine learning models. We have found that the importance scores from Shapley values perform better than the traditional importance scoring methods on almost all the benchmark datasets. We have analyzed the performance of AGRN using the datasets from the DREAM4 and DREAM5 challenges for GRN inference. The proposed method, AGRN, an ensemble machine learning method with Shapley values, outperforms the existing methods on both the DREAM4 and DREAM5 datasets. With improved accuracy, we believe that AGRN-inferred GRNs would enhance our mechanistic understanding of biological processes in health and disease. The code is available at https://github.com/DuaaAlawad/AGRN. Supplementary data are available at Bioinformatics Advances online.
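
The idea of scoring regulators by Shapley values and combining the scores across several models can be illustrated with a minimal sketch. This is not the AGRN implementation (see the repository linked above for that): the three `model_*` functions are hypothetical stand-ins for fitted random forest, extra tree and support vector regressors, the expression samples are made up, Shapley values are computed exactly by subset enumeration against a mean-expression baseline, and the plain average across models is an assumption, since the abstract does not state how the ensemble is combined.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`; absent features take `baseline` values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def importance_scores(model, samples):
    """Mean absolute Shapley value per candidate regulator over all samples."""
    n = len(samples[0])
    baseline = [sum(s[j] for s in samples) / len(samples) for j in range(n)]
    totals = [0.0] * n
    for x in samples:
        for j, p in enumerate(shapley_values(model, x, baseline)):
            totals[j] += abs(p)
    return [t / len(samples) for t in totals]


def ensemble_scores(models, samples):
    """Average the per-model importance scores (assumed combination rule)."""
    per_model = [importance_scores(m, samples) for m in models]
    return [sum(col) / len(models) for col in zip(*per_model)]


# Hypothetical stand-ins for three fitted regressors that predict one target
# gene from the expression of three candidate regulators g[0], g[1], g[2].
def model_a(g):
    return 2.0 * g[0] + 0.5 * g[1]


def model_b(g):
    return 1.5 * g[0] + g[0] * g[1]


def model_c(g):
    return 1.8 * g[0] + 0.2 * g[2]


# Made-up expression samples (rows: samples, columns: candidate regulators).
samples = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 2.0, 0.0],
    [3.0, 1.0, 1.0],
]

scores = ensemble_scores([model_a, model_b, model_c], samples)
# Candidate regulators ordered from most to least important for this target.
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(ranked, scores)
```

In a full GRN setting, this scoring would be repeated with each gene in turn as the target, and the resulting regulator-to-target scores would be ranked to propose edges. Exact enumeration scales exponentially with the number of regulators, so it is only practical for toy inputs like this one.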
