Abstract

Background: Accurate prediction of protein structure is fundamentally important for understanding the biological function of proteins. Template-based modelling, including protein threading and homology modelling, is a popular method for protein tertiary structure prediction. However, accurate template-query alignment and template selection are still very challenging, especially for proteins with only distant homologs available.

Results: We propose a new template-based modelling method called ThreaderAI to improve protein tertiary structure prediction. ThreaderAI formulates the task of aligning a query sequence with a template as the classical pixel classification problem in computer vision and applies a deep residual neural network to the prediction. ThreaderAI first employs deep learning to predict a residue-residue aligning probability matrix by integrating sequence profiles, predicted sequential structural features, and predicted residue-residue contacts, and then builds the template-query alignment by applying a dynamic programming algorithm to the probability matrix. We evaluated our method on both template-query alignment and protein threading. Experimental results show that ThreaderAI outperforms the currently popular template-based modelling methods HHpred and CNFpred and the latest contact-assisted method CEthreader, especially on proteins that do not have close homologs with known structures. In particular, in terms of alignment accuracy measured with TM-score, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 56, 13, and 11%, respectively, on template-query pairs with fold-level similarity from SCOPe data. On CASP13's TBM-hard data, ThreaderAI outperforms HHpred, CNFpred, and CEthreader by 16, 9, and 8% in terms of TM-score, respectively.

Conclusions: These results demonstrate that with the help of deep learning, ThreaderAI can significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.
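
The second stage described above, building an alignment from a residue-residue probability matrix by dynamic programming, can be illustrated with a minimal sketch. The abstract does not specify the scoring scheme, so everything below is an assumption made for illustration: the function name `align_from_probabilities`, the convention that rows index template residues and columns index query residues, the per-pair score `prob - offset`, and the linear `gap_penalty` are hypothetical and are not taken from ThreaderAI.

```python
import numpy as np


def align_from_probabilities(prob, gap_penalty=0.1, offset=0.5):
    """Global alignment over a probability matrix (illustrative only).

    prob[i, j] is taken to be the predicted probability that template
    residue i aligns with query residue j. Each aligned pair scores
    prob[i, j] - offset and each gap position costs gap_penalty.
    Both choices are assumptions, not the published scoring function.
    """
    n, m = prob.shape
    score = prob - offset

    # dp[i, j]: best score aligning the first i template residues
    # with the first j query residues.
    dp = np.zeros((n + 1, m + 1))
    # back[i, j]: 0 = aligned pair (diagonal), 1 = skip a template
    # residue (up), 2 = skip a query residue (left).
    back = np.zeros((n + 1, m + 1), dtype=np.int8)

    dp[1:, 0] = -gap_penalty * np.arange(1, n + 1)
    dp[0, 1:] = -gap_penalty * np.arange(1, m + 1)
    back[1:, 0] = 1
    back[0, 1:] = 2

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            choices = (
                dp[i - 1, j - 1] + score[i - 1, j - 1],
                dp[i - 1, j] - gap_penalty,
                dp[i, j - 1] - gap_penalty,
            )
            best = int(np.argmax(choices))
            dp[i, j] = choices[best]
            back[i, j] = best

    # Trace back from the bottom-right corner to recover aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i, j]
        if move == 0:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif move == 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs, float(dp[n, m])


# Toy example: three template residues, four query residues, with the
# high-probability cells skipping query position 1.
toy = np.full((3, 4), 0.05)
toy[0, 0] = toy[1, 2] = toy[2, 3] = 0.9
toy_pairs, toy_score = align_from_probabilities(toy)
print(toy_pairs)
```

In this toy case the high-probability cells are recovered as aligned pairs and the unmatched query position is treated as a gap. A practical threading scorer would likely use position-dependent or affine gap penalties, which this sketch leaves out.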

Highlights

  • Accurate prediction of protein structure is fundamentally important for understanding the biological function of proteins

  • It remains very challenging for template-based modelling (TBM) methods to predict structures accurately when the structure library contains only remote homologs, which are conserved in structure but share low sequence similarity with the query [5,6,7]

  • We present a new method, called ThreaderAI, which uses a deep residual neural network to perform template-query alignment


Summary

Introduction

Accurate prediction of protein structure is fundamentally important for understanding the biological function of proteins. Recent progress in protein structure prediction has shown that, with the help of deep learning, it is possible for free modelling (FM) methods to generate models with fold-level accuracy for proteins lacking homologs in the protein structure library [1,2,3,4]. As both protein sequence and structure databases expand, template-based modelling (TBM) methods remain very popular and useful [5,6,7] for proteins with homologs available in the protein structure library. The quality of TBM prediction critically relies on template-query alignment and template selection. It remains very challenging for TBM methods to predict structures accurately when the structure library contains only remote homologs, which are conserved in structure but share low sequence similarity with the query [5,6,7].

