Abstract
Advances in proteomics technologies have enabled novel protein interactions to be detected at high speed, but at the expense of relatively low data quality. A crucial step in using high-throughput protein interaction data is therefore to evaluate their confidence and then separate the subsets of reliable interactions from the background noise for further analysis. Using Bayesian network approaches, we combine multiple heterogeneous types of biological evidence, including model organism protein-protein interactions, interaction domains, functional annotation, gene expression, genome context, and network topology, to assign reliability to human protein-protein interactions identified by high-throughput experiments. The method shows high sensitivity and specificity in predicting true interactions from human high-throughput protein-protein interaction data sets. It has been developed into an online confidence scoring system specifically for human high-throughput protein-protein interactions. Users may submit their protein-protein interaction data online, and the confidence scores are returned together with detailed information about the supporting evidence for the query interactions. The Web interface of PRINCESS (protein interaction confidence evaluation system with multiple data sources) is available at the website of China Human Proteome Organisation.
Highlights
The remaining set is used as the test data set to count the number of predicted true positives (TP) and false positives (FP), where a protein pair is predicted to be positive if its likelihood ratio exceeds a particular cutoff, LRcutoff, and negative otherwise.
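The counting rule can be illustrated with a minimal Python sketch. The function name, the input layout (a likelihood ratio paired with a gold standard label), and the example values are assumptions made for illustration and do not come from PRINCESS itself.

```python
def count_tp_fp(scored_pairs, lr_cutoff):
    """Count predicted true positives and false positives at a cutoff.

    scored_pairs: iterable of (likelihood_ratio, is_gold_positive) tuples,
                  one per protein pair in the test set (hypothetical layout).
    lr_cutoff:    a pair is predicted positive only if its likelihood
                  ratio strictly exceeds this value.
    """
    tp = 0
    fp = 0
    for likelihood_ratio, is_gold_positive in scored_pairs:
        if likelihood_ratio > lr_cutoff:   # predicted positive
            if is_gold_positive:
                tp += 1                    # agrees with the gold standard
            else:
                fp += 1                    # contradicts the gold standard
    return tp, fp


# Illustrative test set: the values are made up, not taken from the paper.
example_pairs = [
    (12.0, True),
    (0.4, True),
    (3.5, False),
    (0.1, False),
    (8.0, True),
]
tp, fp = count_tp_fp(example_pairs, lr_cutoff=1.0)
```

Sweeping `lr_cutoff` over a range of values and recording the resulting TP and FP counts is one common way to see how the cutoff trades sensitivity against specificity.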
Six types of biological evidence can be used to assess the confidence of protein interactions. We use the gold standard positive and negative data sets to measure the reliability of each type of biological evidence.
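As a rough illustration, the reliability of one evidence category is commonly expressed as a likelihood ratio: the fraction of gold standard positives that show the evidence divided by the fraction of gold standard negatives that show it. The sketch below assumes this standard definition; the function name, arguments, and counts are hypothetical and are not taken from the paper.

```python
def likelihood_ratio(pos_with, pos_total, neg_with, neg_total):
    """Estimate LR = P(evidence | positive) / P(evidence | negative).

    pos_with / pos_total: gold standard positive pairs showing the
                          evidence, and all gold standard positive pairs.
    neg_with / neg_total: the same counts for gold standard negatives.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("gold standard sets must be non-empty")
    if neg_with == 0:
        # The ratio is undefined without negatives showing the evidence;
        # a real pipeline would need some smoothing choice here.
        raise ValueError("no gold standard negatives show this evidence")
    # Algebraically equal to (pos_with / pos_total) / (neg_with / neg_total).
    return (pos_with * neg_total) / (pos_total * neg_with)


# Made-up counts: 50 of 100 positives and 125 of 1000 negatives
# show the evidence, so the evidence is 4 times more frequent in positives.
lr = likelihood_ratio(pos_with=50, pos_total=100, neg_with=125, neg_total=1000)
```

A ratio above 1 means the evidence is enriched among known interactions, and a ratio below 1 means it is more typical of non-interacting pairs.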
Summary
The main strategy of PRINCESS is to use likelihood ratios to assess the reliability of each type of biological evidence based on gold standard data sets, and to combine these individual likelihood ratios with a Bayesian model to assign confidence scores to the high-throughput protein interactions (Fig. 1).
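One simple way to combine individual likelihood ratios is the naive Bayes form, sketched below. It assumes that the evidence types are conditionally independent, which is an assumption of this illustration and not something the text above confirms about the PRINCESS model; the function names and the prior odds value are also hypothetical.

```python
import math


def combined_likelihood_ratio(likelihood_ratios):
    """Multiply per-evidence likelihood ratios.

    Valid only under the assumed conditional independence of the
    evidence types (naive Bayes); dependent evidence would need a
    fuller Bayesian network.
    """
    return math.prod(likelihood_ratios)


def posterior_probability(likelihood_ratios, prior_odds):
    """Convert prior odds and evidence into a posterior probability.

    posterior odds = prior odds * combined likelihood ratio
    probability    = odds / (1 + odds)
    """
    odds = prior_odds * combined_likelihood_ratio(likelihood_ratios)
    return odds / (1.0 + odds)


# Made-up likelihood ratios for three evidence types and made-up prior odds.
evidence_lrs = [2.0, 4.0, 0.5]
combined = combined_likelihood_ratio(evidence_lrs)
score = posterior_probability(evidence_lrs, prior_odds=0.25)
```

In this form, evidence with a likelihood ratio below 1 lowers the score, so a single unfavorable evidence type can offset favorable ones.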