Abstract

Motivation: Protein fold recognition is often trivial when appropriate, evolutionarily related structural templates can be identified, and may even be viewed as a solved problem. However, in cases where no homologous structural templates can be detected, fold recognition is a notoriously difficult problem (Moult et al., 2014). Here we present EigenTHREADER, a novel fold recognition method capable of identifying folds where no homologous structures can be identified. EigenTHREADER takes a query amino acid sequence, generates a map of inter-residue contacts, and then searches a library of contact maps of known structures. To allow the contact maps to be compared, we use eigenvector decomposition to resolve the principal eigenvectors; these can then be aligned using standard dynamic programming algorithms. The approach is similar to the Al-Eigen approach of Di Lena et al. (2010), but with improvements made to both speed and accuracy. With this search strategy, EigenTHREADER does not depend directly on sequence homology between the target protein and entries in the fold library to generate models. This in turn enables EigenTHREADER to correctly identify analogous folds where little or no sequence homology information is available.

Results: EigenTHREADER outperforms well-established fold recognition methods such as pGenTHREADER and HHSearch in terms of True Positive Rate in the difficult task of analogous fold recognition. This should allow template-based modelling to be extended to many new protein families that were previously intractable to homology-based fold recognition methods.

Availability and implementation: All code used to generate these results and the computational protocol can be downloaded from https://github.com/DanBuchan/eigen_scripts. EigenTHREADER, the benchmark code and the data this paper is based on can be downloaded from http://bioinfadmin.cs.ucl.ac.uk/downloads/eigenTHREADER/.
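
The comparison step described in the abstract (decompose each contact map, then align the eigenvectors by dynamic programming) can be illustrated with a small sketch. This is not the EigenTHREADER implementation. It assumes a single principal eigenvector per map, approximated by power iteration, and a made-up match score (`match_bonus` minus the absolute difference of the two components) with a linear gap penalty. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def contact_map(n, contacts):
    """Build a symmetric 0/1 contact matrix from (i, j) residue index pairs."""
    m = [[0.0] * n for _ in range(n)]
    for i, j in contacts:
        m[i][j] = m[j][i] = 1.0
    return m


def principal_eigenvector(m, iterations=500):
    """Approximate the principal eigenvector by power iteration on (M + I).

    Adding the identity shifts every eigenvalue by +1 without changing the
    eigenvectors, which avoids oscillation on bipartite contact graphs.
    """
    n = len(m)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(m[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def align(u, v, match_bonus=0.5, gap=0.2):
    """Global (Needleman-Wunsch style) alignment of two eigenvectors.

    The scoring scheme is an illustrative assumption, not the published one.
    Returns the alignment score and the list of aligned index pairs.
    """
    n, m = len(u), len(v)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], back[i][0] = -gap * i, "up"
    for j in range(1, m + 1):
        score[0][j], back[0][j] = -gap * j, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            similarity = match_bonus - abs(u[i - 1] - v[j - 1])
            options = [
                (score[i - 1][j - 1] + similarity, "diag"),
                (score[i - 1][j] - gap, "up"),
                (score[i][j - 1] - gap, "left"),
            ]
            score[i][j], back[i][j] = max(options, key=lambda o: o[0])
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy example: a 6-residue "query" map compared with a 5-residue "library" map.
query = contact_map(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 3), (1, 4)])
template = contact_map(5, [(0, 1), (1, 2), (2, 3), (3, 4), (0, 3)])
total, aligned = align(principal_eigenvector(query), principal_eigenvector(template))
print(total, aligned)
```

In a fold-recognition setting, a score like `total` would be computed against every entry in the fold library and the entries ranked by it. A fuller treatment would use several eigenvectors per map, as the abstract indicates.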

Highlights

  • Accurate prediction of protein structure from protein sequence remains a significant open problem in structural biology and bioinformatics, and this topic has received a great deal of attention in the preceding 50 years.

  • Alongside the EigenTHREADER runtimes we show the estimated runtimes for Al-Eigen, given the exponential increase in runtime reported in the work of Di Lena et al. (2010). It is clear that EigenTHREADER represents a substantial increase in performance.

  • Recognition of analogous folds, where no homologues exist in the fold library, is anything but a solved problem.

Summary

Introduction

Accurate prediction of protein structure from protein sequence remains a significant open problem in structural biology and bioinformatics, and this topic has received a great deal of attention in the preceding 50 years. Template-free or ab initio folding attempts to fold proteins using only the physicochemical information implicit in the protein sequence itself. To date, such methods have achieved rather limited success (Moult et al., 2014), though recent developments in protein contact prediction look very promising. The alternative strategy, template-based (or homology) modelling, is widely used by biologists as it has proven to be a robust predictive strategy, enjoying increasing success as both the sequence and structure databases expand.
