Abstract

The first step in homology modeling is to identify a template protein for the target sequence. The template structure is used in later phases of the calculation to construct an atomically detailed model for the target. We have built from the Protein Data Bank (PDB) a large-scale learning set that includes tens of millions of pair matches that can be either a true template or a false one. Discriminatory learning (learning from positive and negative examples) is used to train a decision tree. Each branch of the tree is a mathematical programming model. The decision tree is tested on an independent set from PDB entries and on the sequences of CASP7. It provides significant enrichment of true templates (between 50 and 100%) when compared to PSI-BLAST. The model is further verified by building atomically detailed structures for each of the tentative true templates with modeller. The probability that a true match does not yield an acceptable structural model (within 6 A RMSD from the native structure) decays linearly as a function of the TM structural-alignment score.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.