Abstract

Decreasing sequencing costs have driven a rapid expansion of novel genotyping methods. One of these methods is the exploitation of restriction enzyme cut sites to generate genome-wide but reduced representation sequencing libraries (RRLs), alternatively termed genotyping by sequencing or restriction-site associated DNA sequencing. Without a reference genome, the resulting short sequence reads must be assembled de novo. There are many possible assembly programs, most not explicitly developed for RRL data, and we know little of their effectiveness. In this issue of Molecular Ecology Resources, LaCava et al. (2020) systematically evaluate six commonly used programs and two commonly varied parameters for complete and accurate assembly of RRLs, using simulated double digests of Homo sapiens and Arabidopsis thaliana genomes with varied mutation rates and types. The authors find substantial variation in performance across assembly programs. The most consistently high-performing assembler is infrequently used in their literature survey (CD-HIT; Li and Godzik, 2006), while several others fail to produce complete, accurate assemblies under many conditions. LaCava et al. additionally recommend best practices in parameter choice and evaluation of future assembly programs-advice that molecular ecologists working to assemble sequences of all kinds should take to heart.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call