Defining a similarity threshold for a functional protein sequence pattern: the signal peptide cleavage site.

Henrik Nielsen, Søren Brunak, Gunnar von Heijne, Jacob Engelbrecht

doi:10.1002/(sici)1097-0134(199602)24:2<165::aid-prot4>3.0.co;2-i

Abstract

When preparing data sets of amino acid or nucleotide sequences it is necessary to exclude redundant or homologous sequences in order to avoid overestimating the predictive performance of an algorithm. For some time methods for doing this have been available in the area of protein structure prediction. We have developed a similar procedure based on pair-wise alignments for sequences with functional sites. We show how a correlation coefficient between sequence similarity and functional homology can be used to compare the efficiency of different similarity measures and choose a nonarbitrary threshold value for excluding redundant sequences. The impact of the choice of scoring matrix used in the alignments is examined. We demonstrate that the parameter determining the quality of the correlation is the relative entropy of the matrix, rather than the assumed (PAM or identity) substitution mode. Results are presented for the case of prediction of cleavage sites in signal peptides. By inspection of the false positives, several errors in the database were found. The procedure presented may be used as a general outline for finding a problem-specific similarity measure and threshold value for analysis of other functional amino acid or nucleotide sequence patterns.
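As an illustration of the threshold-selection idea in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Matthews-style correlation coefficient between "pair scores at or above the threshold" and "pair is functionally homologous"; the abstract does not say which coefficient was used. The names `matthews_cc`, `best_threshold`, `scores`, and `homologous` are hypothetical, and the input values are invented purely to exercise the code.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a 2x2 confusion table.

    Returns 0.0 when a marginal total is zero (coefficient undefined).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_threshold(scores, homologous):
    """Scan every observed score as a candidate threshold.

    scores:     alignment similarity score per sequence pair (assumed input)
    homologous: True if the pair is functionally homologous (assumed label)
    Returns (threshold, correlation) for the highest correlation found.
    """
    best = (None, -1.0)
    pairs = list(zip(scores, homologous))
    for t in sorted(set(scores)):
        # A pair is "predicted similar" when its score reaches the threshold.
        tp = sum(1 for s, h in pairs if s >= t and h)
        fp = sum(1 for s, h in pairs if s >= t and not h)
        fn = sum(1 for s, h in pairs if s < t and h)
        tn = sum(1 for s, h in pairs if s < t and not h)
        c = matthews_cc(tp, tn, fp, fn)
        if c > best[1]:
            best = (t, c)
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [10, 12, 15, 30, 35, 40]
    homologous = [False, False, False, True, True, True]
    print(best_threshold(scores, homologous))
```

Under these assumptions, the same scan could be repeated for each scoring matrix, and the peak correlation used to compare how well each similarity measure tracks functional homology.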

Journal: Proteins: Structure, Function, and Genetics
Publication Date: Feb 1, 1996
Citations: 99

Similar Papers

Signal peptide discrimination and cleavage site identification using SVM and NN
H.B Kazemian ... K White
Computers in Biology and Medicine | VOL. 45
01 Dec 2013

Production and purification of novel secreted human proteins
Timothy A Coleman ... Reiner Gentz
Gene | VOL. 190
01 Jan 1997

Discovery and Validation of Novel Peptide Agonists for G-protein-coupled Receptors
Ronen Shemesh ... Yossi Cohen
Journal of Biological Chemistry | VOL. 283
01 Dec 2008

Prediction of Signal Peptide Cleavage Sites with Subsite‐Coupled and Template Matching Fusion Algorithm
Shao‐Wu Zhang ... Jun‐Nan Zhang
Molecular Informatics | VOL. 33
01 Mar 2014
