Abstract

Next-generation sequencing technology has increased the capacity to generate molecular data for plant biological research, including phylogenetics, and can potentially contribute to resolving complex phylogenetic problems. The evolutionary history of Medicago L. (Leguminosae: Trifolieae) remains unresolved due to incongruence between published phylogenies. Identifying the processes that cause this genealogical incongruence is essential for inferring a correct species phylogeny of the genus and requires that more molecular data, preferably from low-copy nuclear (LCN) genes, be obtained across different species. Here we report the development of 50 novel LCN markers in Medicago and assess the phylogenetic properties of each marker. We used the genomic resources available for Medicago truncatula Gaertn., hybridisation-based gene enrichment (sequence capture) techniques and next-generation sequencing to generate sequences. This approach proves to be a cost-effective alternative to amplicon sequencing in phylogenetic studies at the genus or tribe level and allows for an increase in the number and size of targeted loci. Substitution rate estimates for each of the 50 loci are provided, and an overview of the variation in substitution rates among a large number of low-copy nuclear genes in plants is presented for the first time. Aligned sequences of major species lineages of Medicago and its sister genus are made available and can be used in further probe development for sequence capture of the same markers.

Highlights

  • The development and rapidly growing capacity of next-generation sequencing (NGS) has greatly increased the amount of data generated for research in plant biology

  • Genes were chosen according to the following criteria and parameters: distance between each linked gene within groups < 30Kbp, length of genes ! 2Kbp; introns 500bp, genes single-copy within the M. truncatula genome, genes with homologues in other genomes (e.g., Lotus L., Glycine Willd., Populus L., Arabidopsis Heynh.), genes expressed in any part of M. truncatula

  • Recovery of contigs through our bioinformatic protocol failed for two of the 62 selected genes. The causes of this failure are unclear since no differences in terms of size, proportion of exon and intron, GC content, or expression levels were detected between these two genes and the remaining 60
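The gene-selection criteria listed in the highlights can be read as a simple filter over candidate genes. The following is a minimal sketch of that idea, not the authors' pipeline. The `Candidate` record, its field names, and both function names are hypothetical. Only the criteria whose thresholds are unambiguous in the list are encoded: single-copy status, presence of homologues, expression, and a distance of less than 30 Kbp between linked genes. The gene-length and intron-size criteria are left out because their comparison operators are not legible in the list. Measuring distance as the gap between the end of one gene and the start of the next is also an assumption.

```python
from dataclasses import dataclass
from typing import Sequence

# Threshold taken from the criteria: distance between linked genes < 30 Kbp.
MAX_GAP_BP = 30_000


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate gene (all field names are illustrative)."""
    name: str
    start: int            # start coordinate on the chromosome, in bp
    end: int              # end coordinate on the chromosome, in bp
    copies: int           # copies found in the M. truncatula genome
    has_homologues: bool  # homologue present in at least one other genome
    expressed: bool       # expressed in any part of the plant


def passes_gene_checks(gene: Candidate) -> bool:
    """Single-copy, with homologues elsewhere, and expressed."""
    return gene.copies == 1 and gene.has_homologues and gene.expressed


def group_is_tightly_linked(genes: Sequence[Candidate]) -> bool:
    """True if every gap between neighbouring genes is below MAX_GAP_BP.

    Assumption: the gap is measured from the end of one gene to the
    start of the next after sorting by start coordinate.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    return all(
        nxt.start - prev.end < MAX_GAP_BP
        for prev, nxt in zip(ordered, ordered[1:])
    )


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    group = [
        Candidate("geneA", 1_000, 4_000, 1, True, True),
        Candidate("geneB", 20_000, 23_000, 1, True, True),
    ]
    kept = [g for g in group if passes_gene_checks(g)]
    print([g.name for g in kept], group_is_tightly_linked(kept))
```

Keeping the per-gene checks separate from the per-group distance check makes it straightforward to add the remaining length and intron criteria once their exact thresholds are confirmed against the published methods.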



Introduction

The development and rapidly growing capacity of next-generation sequencing (NGS) has greatly increased the amount of data generated for research in plant biology. Large datasets of molecular sequences are being collected across various model and non-model organisms by sequencing whole genomes, transcriptomes, or through enrichment of multiple genes at …

PLoS ONE | DOI:10.1371/journal.pone.0109704

… Lundgrenska fund to B.E.P.; from the P. A. Larssons fund and Lars Hiertas Minne fund to F.S. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

