Abstract

Barcoding of the mitochondrial COX1 gene has been instrumental in cataloguing the tree of life and in providing insights into the phylogeographic history of species. Yet this strategy has encountered difficulties in major clades characterized by large genomes, which contain a high frequency of nuclear pseudogenes originating from the mitochondrial genome (numts). Here, we use the meadow grasshopper (Chorthippus parallelus), which possesses a giant genome of ~13 Gb, to identify mitochondrial genes that are underrepresented as numts and to test their use as informative phylogeographic markers. We recover the same full mitochondrial sequence using both whole-genome and transcriptome sequencing, including functional protein-coding genes and tRNAs. We show that a region of the mitogenome containing the COX1 gene, typically used in DNA barcoding, has disproportionately higher diversity and coverage than the rest of the mitogenome, consistent with multiple insertions of that region into the nuclear genome. By designing new markers in regions with lower diversity and coverage, we identify two mitochondrial genes that are less likely to be duplicated as numts. We show that, although these markers exhibit high levels of incomplete lineage sorting between subspecies, as expected for mitochondrial genes, their genetic variation accurately reflects their phylogeographic history. These findings allow us to identify useful mitochondrial markers for future studies in C. parallelus, an important biological system for evolutionary biology. More generally, this study exemplifies how non-PCR-based methods using next-generation sequencing can be used to avoid numts in species characterized by large genomes, which have remained challenging to study in taxonomy and evolution.
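
The coverage signal described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `flag_high_coverage_windows`, the 100 bp window, and the two-fold threshold are all hypothetical choices made for illustration. It assumes a per-base read-depth list for the mitogenome (for example, parsed from a depth table) and flags windows whose mean depth is well above the median window depth, which is the pattern one would expect where numt-derived reads add to the true mitochondrial reads.

```python
from statistics import median


def flag_high_coverage_windows(depth, window=100, fold=2.0):
    """Return (start, end, mean_depth) for windows with elevated coverage.

    depth  : per-base read depth along the mitogenome (list of numbers)
    window : window size in bp (illustrative default, an assumption)
    fold   : a window is flagged when its mean depth exceeds
             fold * median of all window means (an assumption)
    """
    windows = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        mean_depth = sum(chunk) / len(chunk)
        windows.append((start, start + len(chunk), mean_depth))

    # The median window mean is used as a baseline that is robust
    # to a minority of inflated windows.
    baseline = median(w[2] for w in windows)
    return [w for w in windows if w[2] > fold * baseline]


# Toy depth profile: uniform depth of 100 with one inflated 100 bp region.
toy_depth = [100] * 300 + [400] * 100 + [100] * 200
flagged = flag_high_coverage_windows(toy_depth, window=100, fold=2.0)
print(flagged)
```

Elevated coverage alone is only suggestive, since depth can vary for other reasons; the abstract combines it with elevated diversity before interpreting a region as numt-prone.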

