Abstract

Protein-protein interaction (PPI) networks are the networks of protein complexes formed by biochemical events and electrostatic forces. PPI networks can be used to study diseases and discover drugs. The causes of diseases are evident on a protein interaction level. For instance, an elevation of interaction edge weights of oncogenes is manifested in cancers. Further, the majority of approved drugs target a particular PPI, and thus studying PPI networks is vital to drug discovery. The availability of large datasets and need for efficient analysis necessitate the design of scalable methods leveraging modern high-performance computing (HPC) platforms. In this paper, we design a lightweight framework on a distributed-memory parallel system, which includes scalable algorithmic and analytic techniques to study PPI networks and visualize them. Our study of PPIs is based on network-centric mining and analysis approaches. Since PPI networks are signed (labeled) and weighted, many existing network mining methods working on simple unweighted networks are unsuitable to study PPIs. Further, the large volume and variety of such data limit the use of sequential tools or methods. Many existing tools also do not support a convenient workflow starting from automated data preprocessing to visualizing results and reports for efficient extraction of intelligence from large-scale PPI networks. Our framework supports automated analytics based on a large range of extensible methods for extracting signed motifs, computing centrality, and finding functional units. We design MPI (Message Passing Interface) based parallel methods and workflow, which scale to large networks. The framework is also extensible and sufficiently generic.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call