Abstract

Proteins, the basic structural and functional building blocks of cellular machinery, evolve under significant constraint. Evolutionary pressures limit the acceptable variation and covariation of amino acids in a protein, and these pressures leave constraints that are manifested in the sequence record of a family of evolutionarily related proteins. Some of these sequence constraints arise from within the proteins themselves (intramolecular constraints), while others are imposed externally (intermolecular constraints). This thesis addresses how to identify both intramolecular and intermolecular sequence constraints, represent these constraints with probabilistic graphical models, and use the models predictively and generatively. Our probabilistic graphical models support a general and powerful mechanism for representing and utilizing sequence constraints, providing a sound semantics, an efficient and transparent means of evaluating consistency with the underlying constraints, and an effective framework for exploring the space of satisfying sequences. For intramolecular constraints, our graphical models of residue coupling (GMRCs) identify and encode both residue conservation (individual residue variation) and residue coupling (pairwise residue covariation). Our GMRCs identify biologically relevant constraints and can be used to predict the functional class of a protein with high accuracy. Algorithms we have developed for sampling from GMRCs generate novel sequences that meet the underlying constraints. For intermolecular constraints, our graphical models of residue cross-coupling (GMRCCs) identify and encode residue cross-coupling (covariation between residues in two different proteins). Our GMRCCs identify cross-coupled residues that are known to confer specificity in protein-protein interactions. Using these constraints, our models are able to predict whether or not two proteins will interact. Algorithms we have developed are able to generate the sequences most likely to interact (or not) with particular design targets.
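
To make the two kinds of intramolecular constraint concrete, the sketch below measures residue conservation as the Shannon entropy of one alignment column and residue coupling as the mutual information between two columns. This is only an illustration under stated assumptions: the abstract does not say which statistic the thesis uses to detect coupling, so mutual information is swapped in here as a standard covariation measure, and the toy alignment `msa` and the helper names `column`, `entropy`, and `mutual_information` are hypothetical.

```python
import math
from collections import Counter


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def entropy(symbols):
    """Shannon entropy in bits; 0 means the position is fully conserved."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_a, col_b):
    """Mutual information in bits between two columns (pairwise covariation).

    Computed as H(A) + H(B) - H(A, B). This is an assumed stand-in
    statistic, not necessarily the one used in the thesis.
    """
    joint = list(zip(col_a, col_b))
    return entropy(col_a) + entropy(col_b) - entropy(joint)


# Hypothetical four-sequence alignment, invented for illustration only.
msa = [
    "ACDK",
    "ACEK",
    "GCDR",
    "GCER",
]

conservation_pos1 = entropy(column(msa, 1))                       # always C
coupling_0_3 = mutual_information(column(msa, 0), column(msa, 3))  # A-K, G-R
coupling_0_2 = mutual_information(column(msa, 0), column(msa, 2))  # unrelated

print(conservation_pos1, coupling_0_3, coupling_0_2)
```

In this toy alignment, position 1 has zero entropy (fully conserved), positions 0 and 3 share one full bit of information (A always pairs with K, G with R), and positions 0 and 2 share none. A graphical model of residue coupling would, roughly speaking, keep an edge for a pair like (0, 3) and omit one for a pair like (0, 2).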
