One important challenge in the post-genomic era is to explore disease mechanisms by efficiently integrating different types of biological data. A single disease is usually caused by multiple gene products, such as protein complexes, rather than by a single gene. It is therefore meaningful to discover protein communities in the protein-protein interaction (PPI) network and use them to infer disease-disease associations. In this article, we propose a new framework that integrates protein-protein interaction networks, disease-gene associations and disease-complex pairs to cluster protein complexes and infer disease associations. On three PPI networks, the complexes discovered by our approach are superior in both quality (Sn, PPV and ACC) and clustering quantity to those found by four other popular methods. A systematic analysis shows that disease pairs sharing more protein complexes (such as Glucose and Lipid Metabolic Disorders) are more similar, and that overlapping proteins may play different roles in different diseases. These findings can provide clinical scholars and medical practitioners with new ideas on disease identification and treatment.