Abstract
Cliques (maximal complete subnets) in protein-protein interaction (PPI) networks are an important resource for analyzing protein complexes and functional modules. Clique-based methods of predicting PPIs compensate for the deficiencies of data from biological experiments. However, clique-based prediction methods depend only on network topology, so the false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method that combines clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve prediction accuracy. Using different GO correction rules, we generate two sets of predicted interactions, which guarantee both the quality and the quantity of the predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. Appending the predicted protein interactions to the original network extends its cliques, which demonstrates the biological significance of the predictions.
Highlights
Identifying protein-protein interactions (PPIs) and constructing biological networks are vital to understanding molecular function and cellular organization [1]
The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP) and most of the predicted interactions are verified by another biological database, BioGRID
The estimation based on gene ontology (GO) annotations can enhance the accuracy of PPI predictions [7]
Summary
Identifying protein-protein interactions (PPIs) and constructing biological networks are vital to understanding molecular function and cellular organization [1]. Cliques in PPI networks are tightly related to protein complexes and functional modules and are biologically significant [6]. Estimation based on gene ontology (GO) annotations can enhance the accuracy of PPI predictions [7]. This is because PPIs from cliques usually share terms in the cellular component (CC) or molecular function (MF) GO annotations, owing to the correlation of cliques with complexes or functional modules. The two predicted sets are assessed with a statistical method based on gold standard datasets [9], and the results show the effectiveness of our method. We introduce another dataset, BioGRID [10], which records a larger number of PPIs from biological experiments, to verify the correctness of the predictions. After the predictions are appended to the PPI network, the mined cliques are close to complete complexes.
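As a rough illustration of the general idea (clique-based candidates filtered by shared GO terms), the following Python sketch may help. It is not the procedure of the paper: the candidate rule (an outside protein linked to all but one member of a maximal clique), the filter rule (at least one shared GO term), the function names, and the toy proteins and annotations are all assumptions made for this example.

```python
def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def candidate_edges(adj, min_size=3):
    """Assumed rule: propose (u, v) when v is linked to every clique member except u."""
    candidates = set()
    for clique in maximal_cliques(adj):
        if len(clique) < min_size:
            continue
        for v in set(adj) - clique:
            missing = clique - adj[v]
            if len(missing) == 1:
                (u,) = missing
                candidates.add(frozenset((u, v)))
    return candidates


def go_filter(candidates, go_terms):
    """Assumed rule: keep a candidate only if both proteins share a GO term."""
    kept = set()
    for edge in candidates:
        u, v = tuple(edge)
        if go_terms.get(u, set()) & go_terms.get(v, set()):
            kept.add(edge)
    return kept


# Toy network and annotations, invented for illustration only.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("A", "D"), ("B", "D"), ("B", "F"), ("C", "F")]
toy_go = {
    "A": {"GO:0005737"},
    "C": {"GO:0005634"},
    "D": {"GO:0005634", "GO:0003677"},
    "F": {"GO:0005886"},
}

adj = build_adjacency(toy_edges)
topological = candidate_edges(adj)          # candidates from topology alone
corrected = go_filter(topological, toy_go)  # candidates that survive the GO check
print(sorted(sorted(e) for e in topological))
print(sorted(sorted(e) for e in corrected))
```

In this toy network, topology alone proposes two interactions (C with D, and A with F), and only the pair with a common GO term survives the filter. A stricter or looser sharing rule would trade the number of predictions against their reliability, which is the kind of trade-off the two predicted sets described above address.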