Abstract

A general solution to the problem of directly incorporating data from multiple sequence alignments into the construction of molecular models was approached through the calculation of an estimated pairwise distance based on conserved hydrophobicity. A scaling method was developed that allowed the required bulk geometric properties of the estimated pairwise distances (mean and mean squared) to mimic those expected in a globular protein. These properties were maintained independently of the composition, length, number or degree of conservation of the original sequences. Despite being poor estimates of individual distances, the estimated distances were found to be compatible with the native structure and could be weighted highly. While the estimated distances provided a general drive towards hydrophobic packing, more specific structure (including secondary structures and motifs) was induced by regularization towards an ideal form. These constraints were used to refine an outline starting structure (derived only from secondary structure axes) towards a compact form that was sufficiently protein-like for side chains to be added with almost no further adjustment of the alpha-carbon positions. This process allows rough folds based on abstract representations of protein architecture to be rapidly converted to a form where they can be analysed by the growing number of methods designed to assess molecular models.
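
The abstract does not state how the scaling is carried out. As an illustration only, one simple way to make a set of raw distance estimates match a target mean and mean square is a linear rescaling, sketched below in Python. The function name `rescale_to_moments`, the raw values and the target moments are all hypothetical and are not taken from the paper; the method actually used by the authors may differ.

```python
import math


def rescale_to_moments(values, target_mean, target_mean_sq):
    """Linearly map values (d = a*v + b) so that the mean and mean square
    of the result equal the given targets.

    Assumption: a linear map is used purely for illustration; it is not
    claimed to be the scaling method described in the abstract.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    # The target variance follows from the two target moments.
    target_var = target_mean_sq - target_mean ** 2
    if var <= 0 or target_var < 0:
        raise ValueError("degenerate input or inconsistent target moments")
    a = math.sqrt(target_var / var)   # match the spread
    b = target_mean - a * mean        # then match the mean
    return [a * v + b for v in values]


# Hypothetical raw estimates (e.g. derived from conserved hydrophobicity).
raw = [0.2, 0.5, 0.9, 1.4, 2.0]

# Hypothetical targets: mean 10.0 and mean square 116.0 (variance 16.0).
scaled = rescale_to_moments(raw, target_mean=10.0, target_mean_sq=116.0)
```

Because the map is linear with a positive slope, the rank order of the raw estimates is preserved while the two bulk moments are fixed. That is consistent with the idea that the scaled distances can have globular-protein-like bulk geometry whatever the properties of the input sequences.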
