Investigating selection on viruses: a statistical alignment approach

Saskia De Groot, Jotun Hein, Gerton Lunter, Thomas Mailund

doi:10.1186/1471-2105-9-304

Open Access

https://doi.org/10.1186/1471-2105-9-304

Journal: BMC Bioinformatics
Publication Date: Jul 10, 2008
Citations: 31
License type: CC BY 2.0

Affiliation: University of Oxford, Aarhus University

Abstract

Background: Two problems complicate the study of selection in viral genomes: Firstly, the presence of genes in overlapping reading frames implies that selection in one reading frame can bias our estimates of neutral mutation rates in another reading frame. Secondly, the high mutation rates we are likely to encounter complicate the inference of a reliable alignment of genomes. To address these issues, we develop a model that explicitly models selection in overlapping reading frames. We then integrate this model into a statistical alignment framework, enabling us to estimate selection while explicitly dealing with the uncertainty of individual alignments. We show that in this way we obtain un-biased selection parameters for different genomic regions of interest, and can improve in accuracy compared to using a fixed alignment.

Results: We run a series of simulation studies to gauge how well we do in selection estimation, especially in comparison to the use of a fixed alignment. We show that the standard practice of using a ClustalW alignment can lead to considerable biases and that estimation accuracy increases substantially when explicitly integrating over the uncertainty in inferred alignments. We even manage to compete favourably for general evolutionary distances with an alignment produced by GenAl. We subsequently run our method on HIV2 and Hepatitis B sequences.

Conclusion: We propose that marginalizing over all alignments, as opposed to using a fixed one, should be considered in any parametric inference from divergent sequence data for which the alignments are not known with certainty. Moreover, we discover in HIV2 that double coding regions appear to be under less stringent selection than single coding ones. Additionally, there appears to be evidence for differential selection, where one overlapping reading frame is under positive and the other under negative selection.
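
The difference between marginalizing over all alignments and conditioning on a single fixed alignment can be illustrated with a small dynamic program. The sketch below is not the model of the paper, which handles selection in overlapping reading frames within a statistical alignment framework. It assumes a toy model in which every alignment column is independently a match (weight `1 - 2 * gap`) or a gap in either sequence (weight `gap` each), with uniform base frequencies and a single `mismatch` parameter. All function and parameter names are hypothetical, and the resulting values are unnormalized weights rather than a full probability model.

```python
def emit_pair(a, b, mismatch):
    """Joint weight of an aligned base pair under the toy substitution model."""
    if a == b:
        return 0.25 * (1.0 - mismatch)
    # The mismatch mass is spread evenly over the three other bases.
    return 0.25 * mismatch / 3.0


def alignment_dp(x, y, mismatch, gap, combine):
    """Fill the alignment table, merging the incoming paths with `combine`.

    combine=sum adds up the weight of every alignment (marginalization);
    combine=max keeps only the single best alignment (fixed alignment).
    """
    match_weight = 1.0 - 2.0 * gap
    table = [[0.0] * (len(y) + 1) for _ in range(len(x) + 1)]
    table[0][0] = 1.0
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0 and j > 0:  # x[i-1] aligned to y[j-1]
                pair = emit_pair(x[i - 1], y[j - 1], mismatch)
                terms.append(table[i - 1][j - 1] * match_weight * pair)
            if i > 0:  # x[i-1] aligned to a gap
                terms.append(table[i - 1][j] * gap * 0.25)
            if j > 0:  # y[j-1] aligned to a gap
                terms.append(table[i][j - 1] * gap * 0.25)
            table[i][j] = combine(terms)
    return table[len(x)][len(y)]


def marginal_likelihood(x, y, mismatch, gap):
    """Weight of the sequence pair summed over all alignments."""
    return alignment_dp(x, y, mismatch, gap, sum)


def best_alignment_likelihood(x, y, mismatch, gap):
    """Weight of the sequence pair under the single best alignment."""
    return alignment_dp(x, y, mismatch, gap, max)
```

Maximizing `marginal_likelihood` over `mismatch` uses the contribution of every alignment, whereas maximizing `best_alignment_likelihood` conditions on one alignment only. For divergent sequences the two can favour different parameter values, which is the kind of discrepancy the abstract attributes to using a fixed alignment.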

Highlights

Two problems complicate the study of selection in viral genomes: Firstly, the presence of genes in overlapping reading frames implies that selection in one reading frame can bias our estimates of neutral mutation rates in another reading frame.
We propose that marginalizing over all alignments, as opposed to using a fixed one, should be considered in any parametric inference from divergent sequence data for which the alignments are not known with certainty.
We discover in HIV2 that double coding regions appear to be under less stringent selection than single coding ones.

Summary

Introduction

Two problems complicate the study of selection in viral genomes. Firstly, the presence of genes in overlapping reading frames implies that selection in one reading frame can bias our estimates of neutral mutation rates in another reading frame. Secondly, the high mutation rates we are likely to encounter complicate the inference of a reliable alignment of genomes. To address these issues, we develop a model that explicitly models selection in overlapping reading frames. Since the submission of the first SARS genome in May 2003, over 140 more have been published. With this genomic data at hand, we hope to advance our understanding of viruses. A step towards this is our attempt to develop a method that can deal with the vast amount of viral data, as well as the complexity of viral genomes, their high divergence, and the resulting unreliability of alignments.
