Abstract
We consider the problem of reconstructing a pathway for a given set of proteins from available genomics and proteomics information, such as gene expression data. In previous approaches, the scoring function for a candidate pathway typically depends only on adjacent proteins in the pathway. We propose to also consider proteins that are at distance two in the pathway, which we call Level-2 neighbours. We derive a scoring function based on both adjacent proteins and Level-2 neighbours, and show through a set of experiments that it can increase the accuracy of the predicted pathways. Computing the pathway with optimal score is, in general, NP-hard. We therefore extend a randomised algorithm to work with our scoring function, so that it computes the optimal pathway with high probability.
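To illustrate the idea of scoring with Level-2 neighbours, the following is a minimal sketch, not the scoring function derived in the paper. It assumes a symmetric pairwise similarity table `sim` (for example, derived from gene expression correlation) and a hypothetical weight `level2_weight` for distance-two pairs; the additive form, the function names, and the toy values are all assumptions made for illustration. The exhaustive search stands in for the randomised algorithm only to show what "optimal score" means on a tiny input.

```python
from itertools import permutations


def pathway_score(pathway, sim, level2_weight=0.5):
    """Score an ordered pathway (assumed additive form, for illustration only).

    Sums the similarity of adjacent proteins, then adds the similarity of
    proteins at distance two, scaled by the hypothetical `level2_weight`.
    """
    adjacent = sum(sim[frozenset((a, b))] for a, b in zip(pathway, pathway[1:]))
    level2 = sum(sim[frozenset((a, c))] for a, c in zip(pathway, pathway[2:]))
    return adjacent + level2_weight * level2


def best_pathway(proteins, sim, level2_weight=0.5):
    """Brute-force search over all orderings; feasible only for a few proteins."""
    return max(
        permutations(proteins),
        key=lambda p: pathway_score(p, sim, level2_weight),
    )


# Toy, made-up similarities between three hypothetical proteins.
toy_sim = {
    frozenset(("A", "B")): 1.0,
    frozenset(("B", "C")): 0.5,
    frozenset(("A", "C")): 0.25,
}

# Adjacent pairs: A-B and B-C. Level-2 pair: A-C.
example_score = pathway_score(["A", "B", "C"], toy_sim)  # 1.0 + 0.5 + 0.5 * 0.25 = 1.625
```

With `level2_weight=0`, this sketch reduces to a score that depends only on adjacent proteins, which is the setting the abstract attributes to earlier approaches.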