Abstract

Repetitive microsatellite DNA forms a universal component of eukaryote genomes, and specific biochemical properties of such repeat regions may influence the outcome of laboratory protocols. The Atlantic cod (Gadus morhua) genome contains an order of magnitude more dinucleotide repeats than the majority of vertebrates, with over eight percent of its genome classifiable as either AC or AG dinucleotide repeats. We find that the abundance of these repeats can be inflated in ancient DNA (aDNA) whole genome sequencing (WGS) data generated from this species, in particular in samples with shorter fragment lengths. This inflation is suppressed by a reduced number of amplification cycles and by the inclusion of manufactured dinucleotide repeat oligonucleotides during amplification. These data indicate that a biased amplification reaction leads to artificially high levels of AC and AG repeats. This process appears to be particularly efficient in Atlantic cod, likely due to its high genomic content of repeats with relatively simple sequence complexity. While the extent of such bias in other studies is unclear, we nonetheless urge caution when quantifying repeat content in aDNA WGS data, given that amplification bias can be difficult to detect if this process affects more complex repeat structures than dinucleotide repeats.

Highlights

  • Microsatellite DNA or short tandem repeats (STRs) that iterate short motifs of fewer than 6 base pairs form a universal component of eukaryote genomes (Tautz and Renz 1984, Ellegren 2004, Amos and Clarke 2008)

  • We find that the abundance of AC and AG dinucleotide repeats can be inflated in ancient DNA whole genome sequencing (WGS) data generated from Atlantic cod, in particular in samples with shorter fragment lengths

  • aDNA sequence data generated from human samples, with levels of endogenous DNA comparable to those of the Atlantic cod samples used here, do not have inflated proportions of AC or AG repeats (Supplementary Figure 3)


Summary

Introduction

Microsatellite DNA or short tandem repeats (STRs) that iterate short motifs of fewer than 6 base pairs (bp) form a universal component of eukaryote genomes (Tautz and Renz 1984, Ellegren 2004, Amos and Clarke 2008). While compound microsatellites are found more frequently than expected by chance alone (Kofler et al. 2008), the majority of microsatellites in vertebrate genomes occur as dinucleotide repeats, with AC, AG and AT being the most common types and GC repeats being rare (Ellegren 2004). Their widespread occurrence and high level of individual variation have made microsatellites a popular genetic tool for an impressive range of biological applications (Tautz 1989, Chambers and MacAvoy 2000), even though microsatellite evolution itself is not fully understood (Buschiazzo and Gemmell 2006, Bhargava and Fuentes 2010). The peculiar biochemical properties of microsatellites, and in particular those of dinucleotide repeats, are not often considered to affect whole genome sequencing (WGS) approaches.
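To make the notion of "repeat content" concrete, the following is a minimal sketch of how the proportion of bases falling in AC or AG dinucleotide repeats could be estimated for a single sequence or read. It is an illustration only and not the procedure used in the study: the function name `repeat_fraction`, the motif groupings (which fold in the phase-shifted and reverse-complement forms of each repeat class) and the threshold `MIN_UNITS` of five consecutive motif copies are all assumptions made for this example.

```python
import re

# Assumption for illustration: a tract counts as a repeat only if it
# contains at least this many consecutive copies of the dinucleotide.
MIN_UNITS = 5

# Each repeat class includes its phase shift and reverse complements,
# so that reads from either strand are counted (assumed grouping).
AC_MOTIFS = ("AC", "CA", "GT", "TG")
AG_MOTIFS = ("AG", "GA", "CT", "TC")


def repeat_fraction(seq, motifs, min_units=MIN_UNITS):
    """Return the fraction of bases in seq covered by tandem runs of
    any motif in motifs that are at least min_units copies long."""
    seq = seq.upper()
    if not seq:
        return 0.0
    covered = [False] * len(seq)
    for motif in motifs:
        # e.g. (?:AC){5,} matches five or more consecutive "AC" units
        pattern = re.compile("(?:%s){%d,}" % (motif, min_units))
        for match in pattern.finditer(seq):
            for i in range(match.start(), match.end()):
                covered[i] = True
    # Bases matched by overlapping patterns are counted only once
    return sum(covered) / len(seq)


if __name__ == "__main__":
    read = "ACACACACACACGGGTTT"  # hypothetical read, not real data
    print("AC fraction:", repeat_fraction(read, AC_MOTIFS))
    print("AG fraction:", repeat_fraction(read, AG_MOTIFS))
```

Comparing such a fraction between sequencing libraries and a reference genome is one way an inflation of dinucleotide repeats might be noticed, although the appropriate threshold and motif definitions depend on the analysis at hand.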

