Abstract
Background: Protein-protein interactions (PPIs) are a critical component of many biological processes. A PPI network can provide insight into the mechanisms of these processes, as well as the relationships among the proteins and toxicants potentially involved in them. Many PPI databases are publicly available, each with a specific focus. The challenge is how to combine their contents effectively to generate a robust and biologically relevant PPI network.
Methods: In this study, seven public PPI databases (BioGRID, DIP, HPRD, IntAct, MINT, REACTOME, and SPIKE) were used to explore an approach for combining multiple PPI databases into an integrated PPI network. We developed a novel method called k-votes to create seven different integrated networks by using values of k ranging from 1 to 7. Functional modules were mined using SCAN, a Structural Clustering Algorithm for Networks. Overall module quality was evaluated for each integrated network using the following statistical and biological measures: (1) modularity, (2) similarity-based modularity, (3) clustering score, and (4) enrichment.
Results: Each integrated human PPI network was constructed based on the number of votes (k) that a particular interaction received from the committee of the original seven PPI databases. The performance of the functional modules obtained by SCAN from each integrated network was evaluated, and the optimal value of k was determined by this functional module analysis. Our results demonstrate that the k-votes method outperforms the traditional union approach in terms of both statistical significance and biological meaning. The best network is achieved at k=2; it is composed of interactions that are confirmed in at least two PPI databases. In contrast, the traditional union approach yields an integrated network that consists of all interactions from the seven PPI databases, which may be subject to a high rate of false positives.
Conclusions: We determined that the k-votes method for constructing a robust PPI network by integrating multiple public databases outperforms previously reported approaches and that a value of k=2 provides the best results. The developed strategies for combining databases show promise for advancing network construction and modeling.
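The voting step described above can be illustrated with a minimal sketch. It assumes each database has already been reduced to a list of undirected interactions between Entrez Gene IDs, and that an interaction is kept when at least k databases report it (so k=1 reproduces the union approach). The function names, the toy IDs, and the three-database example are hypothetical and are not the authors' implementation.

```python
from collections import Counter
from itertools import chain


def normalize(edges):
    """Return undirected edges as a set of sorted ID pairs.

    Using a set means each database casts at most one vote per interaction,
    and sorting makes (a, b) and (b, a) count as the same interaction.
    """
    return {tuple(sorted((a, b))) for a, b in edges}


def k_votes(databases, k):
    """Keep the interactions reported by at least k of the given databases."""
    votes = Counter(
        chain.from_iterable(normalize(edges) for edges in databases.values())
    )
    return {edge for edge, count in votes.items() if count >= k}


# Toy data (hypothetical IDs; the study used seven databases, not three).
databases = {
    "BioGRID": [(1, 2), (2, 3), (3, 4)],
    "DIP": [(2, 1), (3, 4)],
    "IntAct": [(4, 3), (5, 6)],
}

# One integrated network per value of k, from the union (k=1) upward.
networks = {k: k_votes(databases, k) for k in range(1, len(databases) + 1)}

for k, edges in networks.items():
    print(k, sorted(edges))
```

In this toy example, raising k shrinks the network from four interactions at k=1 to two at k=2 and one at k=3, which mirrors the trade-off between coverage and confirmation that the module-quality measures are used to evaluate.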
Highlights
Protein-protein interactions (PPIs) are a critical component for many underlying biological processes
Protein-protein interaction (PPI) is a critical component of almost every biological process related to physiological conditions, and can be analyzed in a PPI network to discover underlying mechanisms of toxicity and disease at the integrated system level [1]
The seven PPI databases were preprocessed so that only human data were selected, using unified Entrez Gene IDs
Summary
Protein-protein interactions (PPIs) are a critical component of almost every biological process related to physiological conditions, and can be analyzed in a PPI network to discover underlying mechanisms of toxicity and disease at the integrated system level [1]. Many current studies use PPI databases to mine disease-related genes and proteins in order to better understand disease mechanisms; the hypothesis is that genes related to the same disease tend to encode proteins that interact with each other [2]. PPI data are crucial for discovering new disease biomarkers, searching for disease-disease relationships, and detecting common biological functions. Developing methods to integrate PPI databases and construct a robust and biologically relevant PPI network is therefore of great importance. The question is how to combine multiple PPI databases so that the best integrated PPI network can be established.