GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions

Junsu Ko, Hahnbeom Park, Chaok Seok

doi:10.1186/1471-2105-13-198

Open Access

https://doi.org/10.1186/1471-2105-13-198

Journal: BMC Bioinformatics
Publication Date: Aug 10, 2012
Citations: 125
License type: CC BY

Affiliation: Seoul National University

Abstract

Background: Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates.

Results: We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on “Seok-server,” which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods.

Conclusion: Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
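For readers unfamiliar with the GDT-TS score quoted above, the following is a minimal sketch of how such a score can be computed. It is not taken from the paper. It assumes the model and native C-alpha coordinates are already superimposed and matched residue by residue, and it evaluates a single fixed superposition, whereas the standard GDT-TS calculation searches over superpositions to maximize the count at each cutoff. The function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math


def gdt_ts(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS on a 0-100 scale.

    Assumption: both coordinate lists are already superimposed and
    aligned one-to-one. The standard score optimizes the superposition
    separately for each cutoff; this sketch does not.
    """
    if len(model_ca) != len(native_ca) or not native_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Per-residue C-alpha deviation in angstroms.
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]

    # Fraction of residues within each distance cutoff (1, 2, 4, 8 angstroms).
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # GDT-TS is the mean of the four fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example (made-up coordinates): four residues displaced by
# 0.5, 1.5, 3.0 and 10.0 angstroms along the y axis.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
score = gdt_ts(model, native)
```

In the toy example, one, two, three and three of the four residues fall within the 1, 2, 4 and 8 Å cutoffs, so the score is the mean of 25, 50, 75 and 75. On this scale, a difference such as 68.1 versus 68.4 is a small fraction of a point averaged over all targets.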

Highlights

Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available
Progress in computational protein structure prediction has been boosted by methodological improvements in the technique called template-based modeling (TBM), which uses experimental structures of homologous proteins as templates
Template-based modeling, also called homology modeling or comparative modeling, generally consists of the following steps [1,2]: (1) identification of homologous proteins with known structures to be used as templates; (2) alignment of the sequences of the target and templates; (3) creation of model structures from the alignment; and (4) refinement of the models

Summary

Introduction

Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. One important challenge is how to optimally combine information from multiple templates to build a single model when experimental structures of multiple homologues are available. Since the average quality of multiple templates is bound to be worse than that of the single best template, using multiple templates carries a substantial risk of contaminating the reliable information from the best template. Most existing methods therefore rely heavily on a single top template, while additional templates are used only to fill the gaps not covered by the top template [3,9].

Methods

Results

Conclusion