Abstract

The use of next-generation DNA sequencing technologies has greatly facilitated reference-guided variant detection in complex plant genomes. However, complications may arise when regions adjacent to a read of interest are used for marker assay development, or when reference sequences are incomplete, as short reads alone may not be long enough to ascertain their uniqueness. Here, the possibility of generating longer sequences in discrete regions of the large and complex genome of maize is demonstrated, using a modified version of a paired-end RAD library construction strategy. Reads are generated from DNA fragments first digested with a methylation-sensitive restriction endonuclease, sheared, enriched with biotin and a selective PCR amplification step, and then sequenced at both ends. Sequences are locally assembled into contigs by subgrouping pairs based on the identity of the read anchored by the restriction site. This strategy, applied to two maize inbred lines (B14 and B73), generated 183,609 and 129,018 contigs, respectively, of which at least 76% were longer than 200 bp. A subset of putative single nucleotide polymorphisms from contigs aligning to the B73 reference genome with at least one mismatch was resequenced, and 90% of those in B14 were confirmed, indicating that this method is a potent approach for variant detection and marker development in species that have complex genomes or lack extensive reference sequences.
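The subgrouping step that precedes local assembly can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `group_pairs_by_anchor`, the `site` parameter, and the example recognition sequence are hypothetical, and exact sequence identity of the anchored read is assumed as the grouping key. Each group of mate reads would then be passed to an assembler of choice to build one contig per restriction-site anchor.

```python
from collections import defaultdict


def group_pairs_by_anchor(read_pairs, site):
    """Subgroup paired-end reads by the sequence of the anchored read.

    read_pairs: iterable of (anchored_read, mate_read) sequence strings.
    site: recognition sequence expected at the start of the anchored read
          (hypothetical parameter; the enzyme is not named in the abstract).
    Returns a dict mapping each anchored-read sequence to its mate reads.
    """
    groups = defaultdict(list)
    for anchored, mate in read_pairs:
        # Skip pairs whose first read does not start at the restriction site.
        if not anchored.startswith(site):
            continue
        # Assumption: pairs are grouped on exact identity of the anchored read.
        groups[anchored].append(mate)
    return dict(groups)


# Toy data for illustration only; "CTGCAG" is a placeholder site.
pairs = [
    ("CTGCAGAAAT", "GGGTTT"),
    ("CTGCAGAAAT", "TTTCCC"),
    ("CTGCAGCCCG", "ACGTAC"),
    ("AAAAAAAAAA", "ACGTAA"),  # not anchored at the site, so dropped
]
groups = group_pairs_by_anchor(pairs, site="CTGCAG")
```

Because the sheared end of each fragment falls at a variable distance from the restriction site, the mates within one group cover different positions downstream of the same anchor, which is what allows them to be assembled into a contig longer than a single read.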

Highlights

  • DNA-based genetic markers are pivotal tools for applications as diverse as QTL mapping, marker-assisted selection, association mapping, and fine mapping for the detection of genes linked to a particular phenotype [1]

  • Among the variety of genetic markers that have been developed, those derived from single nucleotide polymorphisms (SNPs) have become the marker of choice for many mapping applications because of their abundance and the availability of high-throughput and cost-effective technologies for detection and diagnostics [2,3,4]

  • In a previous study [8], we developed a methodology for rapid SNP detection in rice and soybean that can be applied to a wide range of moderately or highly complex plant genomes where sufficient genomic reference sequences are available


Summary

Introduction

DNA-based genetic markers are pivotal tools for applications as diverse as QTL mapping, marker-assisted selection, association mapping, and fine mapping for the detection of genes linked to a particular phenotype [1]. While the availability of a high-quality reference sequence may render short reads sufficient for alignment and subsequent SNP detection, the limitations of short reads may be further compounded in crop species by two factors: (1) the inherent complexity of genomes (and transcriptomes) in economically important species such as maize, soybean, or canola, which stems from an elevation in ploidy and/or the frequent expansion of paralogous sequences and gene families and creates the need to generate very long sequencing reads for resolving highly duplicated sequences within a single genome; and (2) the potentially large number of polymorphisms between lines in those same plant species (including indels and variants in regions flanking a polymorphism of interest), which creates the need to provide long line-specific sequences for identifying variants, open reading frames, or other biologically active regions in lines whose genome sequence differs significantly in composition from the reference assembly. Because of those limitations, the Roche 454 FLX platform [11] is often used as the instrument of choice for providing long sequencing reads and generating an appropriate sequencing scaffold.
