Abstract

Background: A key challenge in the field of HIV-1 protein evolution is the identification of coevolving amino acids at the molecular level. In the past decades, many sequence-based methods have been designed to detect position-specific coevolution within and between different proteins. However, an ensemble coevolution system that integrates different methods to improve the detection of HIV-1 protein coevolution has not been developed.

Results: We integrated 27 sequence-based prediction methods published between 2004 and 2013 into an ensemble coevolution system. This system allowed combinations of different sequence-based methods for coevolution predictions. Using HIV-1 protein structures and experimental data, we evaluated the performance of individual and combined sequence-based methods in the prediction of HIV-1 intra- and inter-protein coevolution. We showed that sequence-based methods clustered according to their methodology, and a combination of four methods outperformed any of the 27 individual methods. This four-method combination estimated that HIV-1 intra-protein coevolving positions were mainly located in functional domains and were in physical contact with each other in the protein tertiary structures. In the analysis of HIV-1 inter-protein coevolving positions between Gag and protease, protease drug resistance positions near the active site mostly coevolved with Gag cleavage positions (V128, S373-T375, A431, F448-P453) and Gag C-terminal positions (S489-Q500) under the selective pressure of protease inhibitors.

Conclusions: This study presents a new ensemble coevolution system which detects position-specific coevolution using combinations of 27 different sequence-based methods. Our findings highlight key coevolving residues within HIV-1 structural proteins and between Gag and protease, shedding light on HIV-1 intra- and inter-protein coevolution.

Reviewers: This article was reviewed by Dr. Zoltán Gáspári.

Electronic supplementary material: The online version of this article (doi:10.1186/s13062-014-0031-8) contains supplementary material, which is available to authorized users.

Highlights

  • A key challenge in the field of HIV-1 protein evolution is the identification of coevolving amino acids at the molecular level

  • Sequence data obtained from patients receiving protease inhibitor (PI) treatment were used to detect inter-protein statistical couplings, because the HIV-1 clinical datasets used for evaluation included PI treatment information

  • Thereafter, we designed a heuristic algorithm to optimize the combination of sequence-based methods, which were evaluated by area under the precision-recall curve (AUC)
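
The highlight above mentions a heuristic that optimizes the combination of methods under a precision-recall criterion, but this excerpt does not describe the heuristic itself. The sketch below is therefore only an illustration under stated assumptions: it uses greedy forward selection as a stand-in heuristic, combines methods by averaging their raw scores, and estimates the area under the precision-recall curve step-wise (average precision). The method names, scores, and labels are hypothetical toy values, not data from the study.

```python
def pr_auc(scores, labels):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    hits = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            area += hits / rank  # precision at the rank of each true positive
    return area / total_pos


def combine(score_lists):
    """Assumed combination rule: average the methods' scores pair by pair."""
    return [sum(vals) / len(vals) for vals in zip(*score_lists)]


def greedy_select(method_scores, labels):
    """Stand-in heuristic: add one method at a time while PR-AUC improves."""
    chosen, best = [], 0.0
    remaining = set(method_scores)
    while remaining:
        trial = {
            m: pr_auc(combine([method_scores[k] for k in chosen + [m]]), labels)
            for m in remaining
        }
        # sorted() makes tie-breaking deterministic (alphabetical order)
        candidate = max(sorted(trial), key=trial.get)
        if trial[candidate] <= best:
            break
        chosen.append(candidate)
        best = trial[candidate]
        remaining.remove(candidate)
    return chosen, best


# Hypothetical toy data: four position pairs, label 1 = pair is in contact
labels = [1, 1, 0, 0]
method_scores = {
    "method_a": [0.9, 0.2, 0.8, 0.1],
    "method_b": [0.3, 0.9, 0.1, 0.8],
    "method_c": [0.1, 0.2, 0.9, 0.8],
}
chosen, best = greedy_select(method_scores, labels)
print(chosen, best)
```

In this toy case each of the first two methods ranks one true pair highly and misses the other, so their average separates both true pairs from the rest, while the third method adds nothing. Real method scores live on different scales, so some normalization (for example rank transformation) would likely be needed before averaging.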

Introduction

A key challenge in the field of HIV-1 protein evolution is the identification of coevolving amino acids at the molecular level. Many sequence-based methods have been designed to detect position-specific coevolution within and between different proteins, and a number of studies have applied them to reveal position-specific coevolution in HIV-1 proteins. To model coevolution within and between proteins [11,14,15], position-specific sequence analysis has been used to detect pairs of correlated amino acid positions, so-called statistical couplings [16] (also called co-variations [17] or correlated substitutions [18]). Thanks to the growing number of crystallized structures in public databases, the performance of sequence-based methods is usually evaluated against structural information, such as protein contact maps [25], because spatially proximate positions tend to coevolve [26] and sequence evolution is associated with structural dynamics [27]. State-of-the-art methods have shown significant variability across studies, while the evaluation of long-range coevolving residues remains difficult in most scenarios [15,22,24].
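
As a concrete illustration of what a sequence-based score for correlated positions can look like, the sketch below computes the mutual information between pairs of columns in a multiple sequence alignment. Mutual information is only one widely used family of such measures; this excerpt does not say which 27 methods the study integrated, so the choice here is an assumption, and the four-sequence alignment is a hypothetical toy example rather than HIV-1 data.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts to limit rounding
        mi += p_ab * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def rank_pairs(alignment):
    """Score every pair of columns and sort from most to least coupled."""
    columns = list(zip(*alignment))
    scores = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 0 and 1 always change together,
# column 2 varies independently, column 3 is fully conserved.
toy_alignment = ["AKLV", "AKIV", "GRLV", "GRIV"]
ranking = rank_pairs(toy_alignment)
print(ranking[0])
```

Here columns 0 and 1 receive the highest score because their residues change together, whereas the independently varying and fully conserved columns score zero against every other column. Plain mutual information is known to be sensitive to phylogenetic relatedness and finite sample size, which is part of why many corrected and alternative methods exist.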
