Balancing multiple objectives in conformation sampling to control decoy diversity in template-free protein structure prediction

Ahmed Bin Zaman, Amarda Shehu

doi:10.1186/s12859-019-2794-5

Open Access

https://doi.org/10.1186/s12859-019-2794-5

Journal: BMC Bioinformatics	Publication Date: Apr 25, 2019
Citations: 24	License type: open-access

Affiliation: George Mason University

Abstract

Background: Computational approaches for the determination of biologically-active/native three-dimensional structures of proteins with novel sequences have to handle several challenges. The (conformation) space of possible three-dimensional spatial arrangements of the chain of amino acids that constitute a protein molecule is vast and high-dimensional. Exploration of the conformation spaces is performed in a sampling-based manner and is biased by the internal energy that sums atomic interactions. Even state-of-the-art energy functions that quantify such interactions are inherently inaccurate and associate with protein conformation spaces overly rugged energy surfaces riddled with artifact local minima. The response to these challenges in template-free protein structure prediction is to generate large numbers of low-energy conformations (also referred to as decoys) as a way of increasing the likelihood of having a diverse decoy dataset that covers a sufficient number of local minima possibly housing near-native conformations.

Results: In this paper we pursue a complementary approach and propose to directly control the diversity of generated decoys. Inspired by hard optimization problems in high-dimensional and non-linear variable spaces, we propose that conformation sampling for decoy generation is more naturally framed as a multi-objective optimization problem. We demonstrate that mechanisms inherent to evolutionary search techniques facilitate such framing and allow balancing multiple objectives in protein conformation sampling. We showcase here an operationalization of this idea via a novel evolutionary algorithm that has high exploration capability and is also able to access lower-energy regions of the energy landscape of a given protein with similar or better proximity to the known native structure than several state-of-the-art decoy generation algorithms.

Conclusions: The presented results constitute a promising research direction in improving decoy generation for template-free protein structure prediction with regards to balancing of multiple conflicting objectives under an optimization framework. Future work will consider additional optimization objectives and variants of improvement and selection operators to apportion a fixed computational budget. Of particular interest are directions of research that attenuate dependence on protein energy models.
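
As a rough illustration of what balancing several conflicting objectives can mean in practice, the sketch below filters a set of decoys down to those that are not dominated by any other decoy. It is not the algorithm from the paper: Pareto dominance is assumed here as one common way to compare candidates in multi-objective optimization, all objectives are assumed to be minimized, and the function names and example scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (all objectives minimized).

    a dominates b when a is no worse than b in every objective
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical decoys scored on two made-up objectives (lower is better).
decoy_scores = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
front = non_dominated(decoy_scores)
```

In this example the third decoy is worse than the second on both objectives and is filtered out, while the other three represent different trade-offs and are all retained. Keeping such trade-offs is one way a search can preserve diversity instead of collapsing onto a single lowest-energy basin.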

Highlights

Computational approaches for the determination of biologically-active/native three-dimensional structures of proteins with novel sequences have to handle several challenges
Even state-of-the-art energy functions that quantify atomic interactions in a conformation are inherently inaccurate; they result in overly rugged energy surfaces that are riddled with artifact local minima [9]
Inspired by hard optimization problems in high-dimensional and non-linear variable spaces, we propose that conformation sampling for decoy generation is more naturally framed as a multi-objective optimization problem

Summary

Introduction

Computational approaches for the determination of biologically-active/native three-dimensional structures of proteins with novel sequences have to handle several challenges. The space of possible three-dimensional spatial arrangements of the chain of amino acids that constitute a protein molecule is vast and high-dimensional; we refer to this space as conformation space to recognize choices in the computational representation of a structure. Exploration of such complex spaces is performed in a sampling-based manner (most commonly under the Metropolis Monte Carlo, or MMC, framework) and is biased by the internal energy that sums atomic interactions. Even state-of-the-art energy functions that quantify atomic interactions in a conformation are inherently inaccurate; they result in overly rugged energy surfaces (associated with protein conformation spaces) that are riddled with artifact local minima [9]. The response to these challenges in template-free protein structure prediction is to generate large numbers of low-energy conformations (referred to as decoys) as a way of increasing the likelihood of having a diverse decoy dataset that covers a sufficient number of local minima possibly housing near-native conformations.
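
To make the energy bias in MMC-style sampling concrete, the following is a minimal sketch of the standard Metropolis acceptance rule. It is a generic textbook form, not code from the paper: the function name is hypothetical, and the temperature is assumed to be expressed in the same units as the energy (the Boltzmann constant is absorbed into it).

```python
import math
import random


def metropolis_accept(e_current: float, e_proposed: float,
                      temperature: float, rng=random) -> bool:
    """Decide whether to accept a proposed conformation (Metropolis criterion).

    Moves that do not raise the energy are always accepted; uphill moves
    are accepted with probability exp(-(e_proposed - e_current) / temperature).
    `rng` is any object with a random() method returning a float in [0, 1).
    """
    delta = e_proposed - e_current
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / temperature)


# Illustrative use with made-up energies: a downhill move is always taken.
accepted = metropolis_accept(e_current=-120.0, e_proposed=-125.5, temperature=1.0)
```

Because acceptance depends only on energy differences, an inaccurate energy function steers the walk toward whatever minima it happens to contain, including artifact ones.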

Methods

Results

Conclusion