PCGAN: a generative approach for protein complex identification from protein interaction networks.

Yuliang Pan, Pier Luigi Martelli, Jihong Guan, Yang Wang, Shuigeng Zhou

doi:10.1093/bioinformatics/btad473

Open Access

Abstract

Protein complexes are groups of polypeptide chains linked by non-covalent protein-protein interactions. They play important roles in biological systems and perform numerous functions, including DNA transcription, mRNA translation, and signal transduction. In the past decade, a number of computational methods have been developed to identify protein complexes from protein interaction networks by mining dense subnetworks or subgraphs. In this article, unlike existing works, we propose a novel approach to this task based on generative adversarial networks (GANs), called PCGAN (identifying Protein Complexes by GAN). Using a set of real complexes as training samples, our method learns a model that generates new complexes from a protein interaction network. To effectively support model training and testing, we construct two more comprehensive and reliable protein interaction networks and a larger gold standard complex set by merging existing ones of the same organism (including human and yeast). Extensive comparison studies indicate that our method is superior to existing protein complex identification methods in terms of various performance metrics. Furthermore, functional enrichment analysis shows that the identified complexes are of high biological significance, which indicates that the generated protein complexes are very likely to be real complexes. The PCGAN repository is at https://github.com/yul-pan/PCGAN.
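The abstract does not specify how the existing networks and complex sets were merged. As a purely illustrative sketch, one simple way to combine several interaction lists and several gold standard complex sets for one organism is a deduplicating union. The function names (`merge_networks`, `merge_complex_sets`), the `min_size` threshold, and the toy protein identifiers below are all hypothetical and are not taken from PCGAN.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def merge_networks(*edge_lists: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Union several undirected interaction lists into one edge set.

    Each edge is stored as a frozenset so that (A, B) and (B, A) collapse
    into a single interaction. Self-interactions are dropped (an assumption
    of this sketch, not a statement about the paper's procedure).
    """
    merged: Set[FrozenSet[str]] = set()
    for edges in edge_lists:
        for a, b in edges:
            if a != b:
                merged.add(frozenset((a, b)))
    return merged


def merge_complex_sets(
    *complex_sets: Iterable[Iterable[str]], min_size: int = 2
) -> Set[FrozenSet[str]]:
    """Union several complex collections, removing exact duplicates.

    Complexes with fewer than `min_size` distinct members are discarded
    (a hypothetical filter chosen only for this example).
    """
    merged: Set[FrozenSet[str]] = set()
    for complexes in complex_sets:
        for members in complexes:
            complex_ = frozenset(members)
            if len(complex_) >= min_size:
                merged.add(complex_)
    return merged


# Toy data with made-up protein identifiers.
source_a = [("P1", "P2"), ("P2", "P3")]
source_b = [("P2", "P1"), ("P3", "P4"), ("P5", "P5")]
network = merge_networks(source_a, source_b)

gold_a = [["P1", "P2", "P3"], ["P4"]]
gold_b = [["P3", "P2", "P1"], ["P3", "P4"]]
gold_standard = merge_complex_sets(gold_a, gold_b)
```

Storing edges and complexes as frozensets makes the merge insensitive to ordering, so the same interaction or complex reported by two sources is counted once.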
