Abstract

Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes' genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.

Highlights

  • Mutation clusters consisting of multiple nearby substitutions and indels in sequence alignments are pervasive throughout eukaryotic genomes [1]

  • To model sequence evolution according to just single base substitutions and indels, and sequence evolution which incorporates template switch events, we implemented two probabilistic models: a canonical three-state pair hidden Markov model for linear pairwise sequence alignment, and a seven-state pair hidden Markov model (pairHMM)-like model that incorporates a single region of reverse complement alignment which corresponds to a candidate template switch event

  • Short-range template switching in the great apes
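The canonical three-state pair hidden Markov model named in the highlights can be illustrated with a small Viterbi recursion. This is only a minimal sketch of a generic three-state model (match, gap in one sequence, gap in the other), not the authors' implementation: the function name `viterbi_pair_hmm`, the parameters `delta` (gap opening), `epsilon` (gap extension) and `p_match`, and all default values are assumptions chosen for illustration. The seven-state template switch model is not reproduced here.

```python
import math

NEG_INF = float("-inf")


def viterbi_pair_hmm(x, y, delta=0.05, epsilon=0.3, p_match=0.9):
    """Log-probability of the best alignment of x and y under a
    three-state pair HMM (M = match, I = x against gap, D = y against gap).

    All parameter values are illustrative assumptions.
    """
    # Transition log-probabilities.
    log_mm = math.log(1 - 2 * delta)  # M -> M
    log_mg = math.log(delta)          # M -> I or M -> D (gap open)
    log_gg = math.log(epsilon)        # I -> I or D -> D (gap extend)
    log_gm = math.log(1 - epsilon)    # I -> M or D -> M (gap close)

    # Emission log-probabilities over the DNA alphabet.
    log_eq = math.log(p_match / 4)         # one of 4 identical pairs
    log_ne = math.log((1 - p_match) / 12)  # one of 12 differing pairs
    log_gap = math.log(0.25)               # single base against a gap

    n, m = len(x), len(y)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    I = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    D = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0  # by convention, start in the match state

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                emit = log_eq if x[i - 1] == y[j - 1] else log_ne
                M[i][j] = emit + max(
                    M[i - 1][j - 1] + log_mm,
                    I[i - 1][j - 1] + log_gm,
                    D[i - 1][j - 1] + log_gm,
                )
            if i > 0:
                I[i][j] = log_gap + max(M[i - 1][j] + log_mg,
                                        I[i - 1][j] + log_gg)
            if j > 0:
                D[i][j] = log_gap + max(M[i][j - 1] + log_mg,
                                        D[i][j - 1] + log_gg)

    return max(M[n][m], I[n][m], D[n][m])


if __name__ == "__main__":
    print(viterbi_pair_hmm("ACGTTGCA", "ACGTTGCA"))
    print(viterbi_pair_hmm("ACGTTGCA", "ACGATGCA"))
```

Under a model of this kind, every extra substitution or gap lowers the alignment log-probability, which is why a dense mutation cluster scores poorly under a linear alignment and why a model with a reverse complement region can, for some clusters, offer a better-scoring explanation.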



Introduction

Mutation clusters consisting of multiple nearby substitutions and indels (insertions and deletions) in sequence alignments are pervasive throughout eukaryotic genomes [1]. These complex mutation patterns might arise either through random, independent accumulation of mutations within a small sequence window, or through single mutational events capable of generating many apparent substitutions and indels in a single pass. Methods for inferring adaptive evolution, such as the widely used branch-site test, rely on likelihood ratio testing, for which a core assumption is that substitutions occur independently and at single sites [3, 4]. When this assumption is violated and the branch-site test is applied to regions subject to multinucleotide mutations, false inferences of positive selection are produced [5].
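To make the idea of a single event producing many apparent substitutions concrete, the following sketch replaces a short stretch of a sequence with the reverse complement of a nearby stretch and then lists the mismatching positions. It is a deliberately simplified toy, not the mechanism as modelled in the paper: the helper `apply_template_switch`, its coordinates, the equal-length restriction (which avoids indels) and the example sequence are all assumptions made for illustration.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def apply_template_switch(seq, start, end, src_start):
    """Toy event: overwrite seq[start:end] with the reverse complement
    of an equally long region beginning at src_start (hypothetical helper)."""
    length = end - start
    template = seq[src_start:src_start + length]
    if length <= 0 or len(template) != length:
        raise ValueError("source region must lie within the sequence")
    return seq[:start] + reverse_complement(template) + seq[end:]


def mismatch_positions(a, b):
    """Positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (p, q) in enumerate(zip(a, b)) if p != q]


# Made-up ancestral sequence; one event rewrites positions 8..15.
ancestor = "GATTACAGGCTTACCGATAGCTAGGCATCGA"
descendant = apply_template_switch(ancestor, start=8, end=16, src_start=14)

if __name__ == "__main__":
    print(ancestor)
    print(descendant)
    print(mismatch_positions(ancestor, descendant))
```

In a standard alignment, the differences introduced by this one event would look like several independent substitutions packed into a window of a few bases, which is the kind of pattern that violates the independence assumption described above.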
