An integrated machine learning system to computationally screen protein databases for protein binding peptide ligands

Chen Shao, Ling Zhang, Youhe Gao

doi:10.1096/fasebj.20.4.a528-g

Abstract

A fairly large set of protein interactions is mediated by families of peptide binding domains, such as SH2, SH3, PDZ, and MHC. Identifying their ligands by experimental screening is not only labor-intensive but also almost futile for low-abundance species, because their signal is suppressed by high-abundance species. The ideal way to study these protein-protein interactions is to screen protein sequence databases with high-throughput computational approaches and so direct the validating experiments toward the most promising peptides. Predictors that merely perform well in cross-validation are not good enough to screen a protein database. In this method, only information relevant to the interaction was extracted: a family of domains and their ligands were collected, each set was aligned, and the two were then combined into a prediction system. An integrated machine learning system was built using three novel coding methods and was used to screen the Swiss-Prot and GenBank protein databases for ligands of 10 SH3 and 3 PDZ domains. A large proportion of the predictions have already been experimentally confirmed by independent research groups, indicating satisfactory generalization capability in protein interaction identification.

Highlights

A fairly large set of protein interactions is mediated by families of peptide binding domains, such as Src homology 2 (SH2), SH3, PDZ, and major histocompatibility complex (MHC).
Machine learning approaches such as artificial neural networks [11, 12] and support vector machines (SVMs) [13, 14] have been used in predicting ...
To illustrate the generalization capability of our method on different classes of ligands, we show the results of three domains as examples (Table IV)

Summary

Introduction

A fairly large set of protein interactions is mediated by families of peptide binding domains, such as Src homology 2 (SH2), SH3, PDZ, and major histocompatibility complex (MHC). Identifying their ligands by experimental screening is not only labor-intensive but also almost futile for low-abundance species, because of suppression by high-abundance species. A more plausible way of studying protein-protein interactions is to use high-throughput computational prediction, rather than experimental approaches, to screen protein sequence databases and direct the validating experiments toward the most promising peptides. Machine learning approaches such as artificial neural networks [11, 12] and support vector machines (SVMs) [13, 14] have been used in predicting ...

Abbreviations: ESP, estimated screening precision; MHC, major histocompatibility complex; MCC, Matthews correlation coefficient; BLU, Boehringer light unit.
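The general idea of screening a sequence database and ranking candidate peptides for follow-up experiments can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the window width, and in particular `toy_score`, which simply counts PxxP patterns (a motif commonly associated with SH3 ligands) and stands in for a trained predictor. It is not the coding methods or the machine learning system described in the paper.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Hit = Tuple[float, str, int, str]  # (score, protein id, start index, peptide)


def sliding_windows(sequence: str, width: int) -> Iterator[Tuple[int, str]]:
    """Yield every contiguous peptide of the given width with its start index."""
    for start in range(len(sequence) - width + 1):
        yield start, sequence[start:start + width]


def toy_score(peptide: str) -> float:
    """Hypothetical stand-in for a trained predictor (NOT the paper's method).

    Counts PxxP occurrences and adds the proline fraction as a tie-breaker.
    """
    motif_hits = sum(
        1
        for i in range(len(peptide) - 3)
        if peptide[i] == "P" and peptide[i + 3] == "P"
    )
    return motif_hits + peptide.count("P") / len(peptide)


def screen(
    database: Dict[str, str],
    score: Callable[[str], float],
    width: int = 10,
    top_n: int = 5,
) -> List[Hit]:
    """Score every window of every protein and return the top-ranked peptides."""
    hits: List[Hit] = []
    for protein_id, sequence in database.items():
        for start, peptide in sliding_windows(sequence, width):
            hits.append((score(peptide), protein_id, start, peptide))
    # Highest score first; these are the candidates to validate experimentally.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_n]


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the functions above.
    example_db = {
        "protA": "MKTAYIAKQRPPLPPRPSGE",
        "protB": "GGGGSGGGGSGGGGSGGGGS",
    }
    for hit in screen(example_db, toy_score, width=10, top_n=3):
        print(hit)
```

In a real screen, `toy_score` would be replaced by a model trained on known domain-ligand pairs, and the ranked list would be cut off at a score threshold chosen for acceptable precision rather than at a fixed `top_n`.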

Journal: The FASEB Journal
Publication Date: Mar 1, 2006
License type: cc-by

Similar Papers

An Integrated Machine Learning System to Computationally Screen Protein Databases for Protein Binding Peptide Ligands
Ling Zhang ... Youhe Gao
Molecular & Cellular Proteomics | VOL. 5
01 Jul 2006

Characterizing Binding Properties of Protein Interaction Domain
Y Gao
Journal of Proteomics & Bioinformatics | VOL. S2
01 Jul 2008

Genetically Encoded Residue-Selective Photo-Crosslinker to Capture Protein-Protein Interactions in Living Cells
Wei Hu ... Xiao-Hua Chen
Chem | VOL. 5
23 Sep 2019

PDZ domains: troubles in classification
Paola Vaccaro ... Luciana Dente
FEBS Letters | VOL. 512
13 Jan 2002
