Abstract
Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Research has repeatedly shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare several basin-based decoy selection strategies and demonstrate that they are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.
Highlights
Protein molecules control virtually all processes that maintain and replicate a living cell. The conformations into which the sequence of amino acids that constitutes a protein molecule folds in three-dimensional space are central to its biological activities in the cell.
We look deeper into the basins selected by Basin-Pareto Rank (PR) and Basin-PR+Pareto Count (PC) on easy, medium, and hard cases with Protein Data Bank (PDB) entries 1wapa, 1bq9, and 2ezk.
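The Pareto Rank and Pareto Count named above are not defined in this summary, so the following is only a minimal sketch under stated assumptions. It assumes the usual multi-objective definitions: the Pareto rank of an item is the number of items that dominate it, and its Pareto count is the number of items it dominates. The two objectives used here (lowest energy in a basin and negated basin size, both minimized) and the example values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_rank_and_count(
    points: Sequence[Sequence[float]],
) -> Tuple[List[int], List[int]]:
    """Assumed definitions: rank = number of points dominating a point,
    count = number of points that the point dominates."""
    ranks = [sum(dominates(q, p) for q in points) for p in points]
    counts = [sum(dominates(p, q) for q in points) for p in points]
    return ranks, counts


# Hypothetical basins as (lowest energy, -size); the values are illustrative only.
basins = [(-120.0, -40), (-110.0, -55), (-100.0, -30), (-125.0, -10)]
ranks, counts = pareto_rank_and_count(basins)

# One possible "PR+PC" ordering (an assumption): lowest rank first,
# ties broken by highest count.
order = sorted(range(len(basins)), key=lambda i: (ranks[i], -counts[i]))
```

Under these assumptions, a basin with rank 0 is non-dominated, and the count serves only to break ties among basins of equal rank.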
Summary
The conformations into which the sequence of amino acids that constitutes a protein molecule folds in three-dimensional space are central to its biological activities in the cell. Due to the central role that conformations of a protein play in governing recognition events, significant efforts in wet laboratories are devoted to determining biologically active conformations as a means of decoding protein function. This task is growing in urgency due to the millions of uncharacterized protein-encoding gene sequences deposited in genomic databases by increasingly fast and inexpensive high-throughput gene sequencing technologies [2].