Secondary structure prediction quality for naturally occurring amino acids in soluble proteins

Davor Juretić, Bono Lucic, Nenad Trinajstić

doi:10.1016/0166-1280(94)04047-v

Abstract

To judge the performance of protein secondary structure prediction, it is common to use performance measures that report the prediction accuracy for each conformation of the three-state model (α-helix, β-sheet and loop). Much more specific performance quality factors can be associated with each amino acid type found in each conformation. Such measures are introduced in this work and used to test both the weak and the strong features of secondary structure prediction with neural network algorithms. Proline in the loop conformation is the best predicted amino acid conformation. At the same time, proline is the worst predicted amino acid in regular secondary structures. Other helix cap residues (glycine, serine, asparagine, aspartate and histidine) are also poorly predicted in regular secondary structures. The overall percentage of correct predictions ranges from 77% for methionine to 65% for cysteine. Based on these results, the prediction accuracy profile can be reported as a sequence of numbers along a polypeptide sequence for each protein tested with the chosen prediction scheme. Sequence segments associated with a low prediction accuracy indicate that the training database of proteins was not adequate for predicting such segments, even when the best available pattern recognition scheme is used.
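The idea of scoring each amino acid type in each conformation, and then reading those scores back along a sequence, can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define the quality factors, so the sketch assumes the simplest possible measure, the fraction of correctly predicted residues per (amino acid, observed state) pair. The function names `per_residue_accuracy` and `accuracy_profile`, the one-letter state codes H, E and L, and the toy sequence are all hypothetical.

```python
from collections import defaultdict


def per_residue_accuracy(sequence, observed, predicted):
    """Fraction of correct predictions per (amino acid, observed state) pair.

    Assumption: a plain fraction-correct score stands in for the paper's
    quality factors, which the abstract does not define.
    States are one-letter codes: H (helix), E (sheet), L (loop).
    """
    if not (len(sequence) == len(observed) == len(predicted)):
        raise ValueError("sequence, observed and predicted must have equal length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for aa, obs, pred in zip(sequence, observed, predicted):
        total[(aa, obs)] += 1
        if obs == pred:
            correct[(aa, obs)] += 1
    return {key: correct[key] / total[key] for key in total}


def accuracy_profile(sequence, accuracy_by_aa):
    """Map each residue to the overall accuracy of its amino acid type.

    Residues missing from accuracy_by_aa yield None.
    """
    return [accuracy_by_aa.get(aa) for aa in sequence]


# Toy data (hypothetical): proline is predicted correctly in loops only.
scores = per_residue_accuracy("PPAGP", "LLHHH", "LLHLL")

# Overall accuracies quoted in the abstract: 77% for Met, 65% for Cys.
profile = accuracy_profile("MCM", {"M": 0.77, "C": 0.65})
```

In practice the counts would be accumulated over a whole test set of proteins before dividing, and the resulting profile could be smoothed over a window to highlight poorly predicted segments; both choices are left open here.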

Full Text