Abstract

Chou and Fasman developed the first empirical method for predicting the secondary structure of proteins from their amino acid sequences. Subsequently, the more sophisticated GOR method was developed. Although these methods became very popular among biologists, their accuracy was only slightly better than random. A significant improvement in prediction accuracy, to >70%, was achieved by ‘second generation’ methods such as PHD, SAM-T98, and PSIPRED, which utilize information on sequence conservation. Only recently, F. B. Akcesme developed a local-similarity-based method that obtains an accuracy of >90% in the secondary structure prediction of any new protein. In this article, we examine the possibility of sequence-similarity-based secondary structure prediction of proteins. To this end, every protein in the PDB dataset is searched for as an identical subsequence within the other, larger proteins of the PDB dataset. Around 17% of the proteins in the PDB dataset are found to occur as identical subsequences in other, larger proteins of the dataset. When the secondary structure of each such protein is assigned from the corresponding identical region of the larger protein, the average prediction accuracy is 90.39%. We therefore conclude that an unknown protein has a 17% chance of having an identical subsequence in a larger protein in the Protein Data Bank (PDB), and that in this case its secondary structure can be predicted with around 90% accuracy by this method.
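The containment search and structure transfer described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy database, the identifier `toyA`, the function names `predict_by_containment` and `q3_accuracy`, the three-state alphabet (H, E, C), and the "strictly longer" reading of "larger protein" are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy database: id -> (sequence, secondary structure string).
# H = helix, E = strand, C = coil. These are NOT real PDB entries.
ToyDB = Dict[str, Tuple[str, str]]


def predict_by_containment(query: str, db: ToyDB) -> Optional[Tuple[str, str]]:
    """Return (hit_id, predicted_ss) from the first strictly longer
    protein that contains `query` as an identical subsequence, or None."""
    for pid, (seq, ss) in db.items():
        if len(seq) <= len(query):
            continue  # assumption: "larger" means strictly longer
        start = seq.find(query)  # exact, contiguous match only
        if start != -1:
            # Transfer the structure of the matching region to the query.
            return pid, ss[start:start + len(query)]
    return None


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed one."""
    if not observed or len(predicted) != len(observed):
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    db: ToyDB = {"toyA": ("MKTAYIAKQRQISFVK", "CCHHHHHHHHCCEEEC")}
    hit = predict_by_containment("AYIAKQRQ", db)
    print(hit)
    if hit is not None:
        # "CHHHHHHC" is an invented observed structure for the query.
        print(q3_accuracy(hit[1], "CHHHHHHC"))
```

Averaging `q3_accuracy` over every query that has a hit would give a figure comparable in kind to the reported average accuracy, and the fraction of queries with a hit corresponds to the reported coverage.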

Highlights

  • Knowledge of protein structures is essential for understanding both the mechanisms of folding and the biological function of proteins

  • For each protein in the Protein Data Bank (PDB), we find the proteins that contain it as a subsequence

  • We examine how far the secondary structure of proteins can be predicted from the set of solved structures currently deposited in the PDB


Summary

Introduction

Knowledge of protein structures is essential for understanding both the mechanisms of folding and the biological function of proteins. X-ray diffraction has been used successfully to determine the secondary and tertiary structures of many crystallized proteins. This method is highly accurate, but it is expensive and time-consuming, and many membrane and ribosomal proteins have not yet been crystallized. It is widely believed that the native conformation of a protein is determined by its amino acid sequence (Anfinsen et al., 1961), and many efforts, so far unsuccessful, have been made to predict protein secondary and tertiary structures from sequence data. Many workers have used different methods to predict protein secondary structure.

Can / Southeast Europe Journal of Soft Computing Vol. No.1 March 2017 (44-50)

