Abstract

Advances in proteomics technologies have enabled novel protein interactions to be detected at high speed, but at the expense of relatively low data quality. A crucial step in using high throughput protein interaction data is therefore to evaluate their confidence and to separate the subsets of reliable interactions from the background noise for further analysis. Using Bayesian network approaches, we combine multiple heterogeneous types of biological evidence, including model organism protein-protein interactions, interaction domains, functional annotation, gene expression, genome context, and network topology structure, to assign reliability to the human protein-protein interactions identified by high throughput experiments. The method shows high sensitivity and specificity in predicting true interactions from human high throughput protein-protein interaction data sets. It has been developed into an online confidence scoring system specifically for human high throughput protein-protein interactions. Users may submit their protein-protein interaction data online, and the system returns the confidence scores together with detailed information about the evidence supporting each query interaction. The Web interface of PRINCESS (protein interaction confidence evaluation system with multiple data sources) is available at the website of China Human Proteome Organisation.

Highlights

  • The remaining set serves as the test data set for counting the predicted true positives (TP) and false positives (FP): a protein pair is predicted to be positive if its likelihood ratio exceeds a particular cutoff, LRcutoff, and negative otherwise

  • Six Types of Biological Evidence Can Be Used to Assess the Confidence of Protein Interactions. We use the golden standard positive and negative data sets to measure the reliability of each type of biological evidence
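
The TP/FP counting rule described in the highlights can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name `count_tp_fp`, the `(likelihood_ratio, is_gold_positive)` tuple layout, and the toy scores are all assumptions made for the example.

```python
from typing import Iterable, Tuple


def count_tp_fp(
    scored_pairs: Iterable[Tuple[float, bool]], lr_cutoff: float
) -> Tuple[int, int]:
    """Count true and false positives at a given likelihood-ratio cutoff.

    Each item is (likelihood_ratio, is_gold_positive) for one protein pair
    in the test set. A pair is predicted positive only if its likelihood
    ratio strictly exceeds the cutoff.
    """
    tp = 0
    fp = 0
    for lr, is_gold_positive in scored_pairs:
        if lr > lr_cutoff:          # predicted positive
            if is_gold_positive:
                tp += 1             # predicted positive, truly positive
            else:
                fp += 1             # predicted positive, truly negative
    return tp, fp


# Hypothetical test set: the scores and labels are invented for illustration.
toy_test_set = [(12.0, True), (0.4, False), (3.5, False), (8.1, True), (0.9, True)]

if __name__ == "__main__":
    tp, fp = count_tp_fp(toy_test_set, lr_cutoff=2.0)
    print(f"TP={tp} FP={fp}")
```

With the toy data and a cutoff of 2.0, three pairs are predicted positive, of which two are true positives and one is a false positive. Sweeping `lr_cutoff` over a range of values would trace out how TP and FP trade off against each other.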


Summary

EXPERIMENTAL PROCEDURES

The main strategy of PRINCESS is to use likelihood ratios, estimated from golden standard data sets, to assess the reliability of each individual type of biological evidence, and then to combine these individual likelihood ratios in a Bayesian model to assign confidence scores to the high throughput protein interactions (Fig. 1).
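
To make the strategy concrete, the sketch below estimates a likelihood ratio for one type of evidence from golden standard counts and then combines several likelihood ratios into posterior odds. It assumes the evidence types are conditionally independent (a naive Bayes simplification), which this summary does not confirm is the model PRINCESS uses. The function names, the counts, and the prior odds are hypothetical.

```python
from math import prod
from typing import Sequence


def likelihood_ratio(
    pos_with: int, pos_total: int, neg_with: int, neg_total: int
) -> float:
    """LR = P(evidence | positive) / P(evidence | negative).

    Both probabilities are estimated as simple fractions of the golden
    standard positive and negative sets that show the evidence.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("golden standard sets must be non-empty")
    if neg_with == 0:
        raise ValueError("no negatives show this evidence; LR is undefined")
    return (pos_with / pos_total) / (neg_with / neg_total)


def posterior_odds(prior_odds: float, lrs: Sequence[float]) -> float:
    """Posterior odds = prior odds * product of the individual LRs.

    Assumption: evidence types are conditionally independent given the
    interaction status (naive Bayes). Dependent evidence would need a
    joint model instead of a simple product.
    """
    return prior_odds * prod(lrs)


if __name__ == "__main__":
    # Invented counts: 40 of 100 positives and 5 of 100 negatives show the evidence.
    lr_example = likelihood_ratio(40, 100, 5, 100)
    # Invented prior odds and LRs for three evidence types.
    odds = posterior_odds(0.01, [lr_example, 2.0, 0.5])
    print(lr_example, odds)
```

A likelihood ratio above 1 means the evidence is more common among golden standard positives than negatives, so it raises the odds that a pair truly interacts; a ratio below 1 lowers them.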

Golden Standard Data Sets
Construction of Multiple Types of Biological Evidences
Combining the Confidence Scores from Individual Evidences by Bayesian Rules
RESULTS
DISCUSSION