Abstract

Semi-supervised learning (SSL) is an approach to machine learning that uses unlabeled data for training alongside a small amount of labeled data. In molecular biology and pharmacology, unlabeled data can be exploited in the same way. For instance, when identifying drugs and their targets, only a few genes are known to be associated with a specific drug target, and these serve as the labeled data. Labeling genes requires laboratory verification and validation, a process that is usually very time-consuming and expensive. It is therefore useful to estimate the functional role of drugs from unlabeled data using computational methods. To develop such a model, we used openly available data resources to create two bipartite graphs: (i) drugs and genes, and (ii) genes and diseases. We constructed the genetic embedding graph from the two bipartite graphs using tensor factorization methods, and integrated it with the publicly available protein functional association network. Our results show the usefulness of this integration in effectively predicting drug labels.
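The pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the tensor factorization, so the sketch swaps in plain co-occurrence counts to derive a gene-gene graph from the two bipartite graphs, and it uses a standard label-spreading iteration as the semi-supervised step. All function names, the toy matrices, the mixing `weight`, and the `alpha` parameter are assumptions made for illustration only.

```python
import numpy as np

def gene_graph_from_bipartite(drug_gene, gene_disease):
    """Gene-gene affinity from shared drugs and shared diseases.

    Assumption: simple co-occurrence counts stand in for the tensor
    factorization mentioned in the abstract.
    """
    shared_drugs = drug_gene.T @ drug_gene            # genes x genes
    shared_diseases = gene_disease @ gene_disease.T   # genes x genes
    affinity = (shared_drugs + shared_diseases).astype(float)
    np.fill_diagonal(affinity, 0.0)                   # drop self-loops
    return affinity

def integrate(affinity, protein_network, weight=0.5):
    """Hypothetical integration step: a weighted sum of the two graphs."""
    return weight * affinity + (1.0 - weight) * protein_network

def spread_labels(adjacency, labels, labeled_mask, alpha=0.8, iterations=50):
    """Label spreading: F <- alpha * S @ F + (1 - alpha) * Y,
    where S is the symmetrically normalized adjacency matrix."""
    degree = adjacency.sum(axis=1)
    degree[degree == 0] = 1.0                         # avoid division by zero
    inv_sqrt = 1.0 / np.sqrt(degree)
    S = adjacency * inv_sqrt[:, None] * inv_sqrt[None, :]
    Y = np.where(labeled_mask[:, None], labels, 0.0)  # hide unlabeled rows
    F = Y.copy()
    for _ in range(iterations):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F

# Toy data (made up): 3 drugs, 4 genes, 2 diseases.
drug_gene = np.array([[1, 1, 0, 0],
                      [0, 1, 0, 0],
                      [0, 0, 1, 1]])
gene_disease = np.array([[1, 0],
                         [1, 0],
                         [0, 1],
                         [0, 1]])
# Toy protein association network with one weak link between genes 1 and 2.
protein_network = np.zeros((4, 4))
protein_network[1, 2] = protein_network[2, 1] = 0.2

affinity = gene_graph_from_bipartite(drug_gene, gene_disease)
graph = integrate(affinity, protein_network)

# Only genes 0 and 3 are labeled (one-hot over two hypothetical classes).
labels = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
labeled_mask = np.array([True, False, False, True])

scores = spread_labels(graph, labels, labeled_mask)
predicted = scores.argmax(axis=1)  # class assigned to every gene
```

In this toy setting the two unlabeled genes take the class of the labeled gene they are most strongly connected to; the weak protein-network link between the two clusters is not enough to flip either prediction.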
