Abstract

Predicting gene-drug-disease interactions has yielded new insights in biology. Discovering unknown interactions will provide new therapeutic approaches for exploring gene expression. Recent advances in machine learning techniques have attracted considerable interest because of their higher efficiency, accurate results, and lower cost. However, most studies ignore relevant associations by representing only drug-disease interactions on a network, even though publicly available data offer a wide variety of interactions. Additionally, some computational techniques used in this domain face new challenges related to organizing heterogeneous data, which suffer from a high imbalance rate because there are far more non-interacting gene-drug-disease triplets than interacting ones. In this paper, we integrate heterogeneous biological data about genes, drugs, and diseases to build a model, and we construct a new graph representation relating gene-drug-disease interactions. Using the extreme gradient boosting (XGBoost) algorithm, we extract a list of valid gene-drug-disease triplet interactions and a list of gene-drug pairs related to lung cancer.

Keywords: Biological heterogeneous data, Data integration, Gene-Drug-Disease interactions, Machine learning.

