Abstract

High-throughput sequencing has laid the foundation for fast and cost-effective development of phylogenetic markers. Here we present the program discomark, which streamlines the development of nuclear DNA (nDNA) markers from whole-genome (or whole-transcriptome) sequencing data, combining local alignment, alignment trimming, reference mapping and primer design based on multiple sequence alignments to design primer pairs from input orthologous sequences. To demonstrate the suitability of discomark, we designed markers for two groups of species, one of closely related species and one of distantly related species. For the closely related members of the species complex of Cloeon dipterum s.l. (Insecta, Ephemeroptera), the program discovered a total of 78 markers. Among these, we selected eight markers for amplification and Sanger sequencing. The exon sequence alignments (2526 base pairs) were used to reconstruct a well-supported phylogeny and to infer clearly structured haplotype networks. For the distantly related species, we designed primers for the insect order Ephemeroptera, using available genomic data from four sequenced species. We developed primer pairs for 23 markers that are designed to amplify across several families. The discomark program will enhance the development of new nDNA markers by providing a streamlined, automated approach to perform genome-scale scans for phylogenetic markers. The program is written in Python and released under a public licence (GNU GPL version 2). It is available, together with a manual and an example data set, at: https://github.com/hdetering/discomark.
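To make the idea of primer design based on multiple sequence alignments more concrete, the following is a minimal illustrative sketch, not the discomark implementation. It assumes that a candidate primer site is simply a run of alignment columns in which all sequences share the same unambiguous base; the function names and the `primer_len` parameter are hypothetical, and real primer design also weighs melting temperature, GC content and product length.

```python
def column_is_conserved(column):
    """True if every sequence has the same unambiguous base (no gaps)."""
    bases = set(column.upper())
    return len(bases) == 1 and bases <= set("ACGT")


def conserved_windows(alignment, primer_len=20):
    """Return half-open (start, end) ranges of fully conserved column runs
    that are at least primer_len columns long (hypothetical criterion)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    windows = []
    run_start = None
    # Iterate one position past the end so a trailing run is closed.
    for i in range(length + 1):
        conserved = i < length and column_is_conserved(
            "".join(seq[i] for seq in alignment)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= primer_len:
                windows.append((run_start, i))
            run_start = None
    return windows


# Toy alignment of three sequences with one variable, gapped column.
toy_alignment = [
    "ACGTACGTACCGGTT",
    "ACGTACGTTCCGGTT",
    "ACGTACGT-CCGGTT",
]
candidate_sites = conserved_windows(toy_alignment, primer_len=6)
```

In this toy case the variable column splits the alignment into two conserved runs; a forward primer could in principle be placed in the first and a reverse primer in the second, so that the amplified region spans the variable position.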

Highlights

  • The inference of phylogenetic relationships has benefited profoundly from the availability of nuclear DNA sequences for an increasing number of organism groups

  • DISCOMARK is the first stand-alone program of which we are aware that discovers putative single-copy nDNA markers and designs primer pairs based on multiple sequence alignments on a genome-wide scale

  • Automatic processing of any nucleotide FASTA sequences (combining, aligning, trimming and BLAST searching), together with the graphical output produced, significantly facilitates the design of primer pairs for a large number of nDNA markers


Introduction

The inference of phylogenetic relationships has benefited profoundly from the availability of nuclear DNA (nDNA) sequences for an increasing number of organism groups. For many taxonomic groups, however, only a handful of nDNA markers are available that are suitable for phylogenetic reconstruction. Other approaches, such as ultra-conserved element (UCE) sequencing (Faircloth et al 2012), anchored hybrid enrichment (Lemmon and Lemmon 2012), restriction site-associated DNA (RAD) sequencing (Baird et al 2008) or genotyping by sequencing (GBS, Elshire et al 2011), have become popular for addressing specific questions in systematics or population genetics; however, these methods are still cost-intensive, require a comparatively large amount of starting DNA material and can depend on the availability of reference genomes (e.g. anchored hybrid enrichment). Standard Sanger sequencing approaches are still in high demand for various research questions
