Abstract

Quantitative genetic simulations can save time and resources by optimizing the logistics of an experiment. Current tools are difficult for those unfamiliar with programming to use, and they rarely address the actual genetic structure of the population under study. Here, we introduce crossword, which uses widely available re-sequencing and genomics data to create more realistic simulations and to reduce user burden. The software is written in R to simplify installation and implementation. Because crossword is a domain-specific language, it allows complex and unique simulations to be performed, and the language is supported by a graphical interface that guides users through functions and options. We first show crossword’s utility in QTL-seq design, where its output accurately reflects empirical data. By introducing the concept of levels to reflect family relatedness, crossword can simulate a broad range of breeding programs and crops. Using levels, we further illustrate crossword’s capabilities by examining the effect of family size and number of selfing generations on phenotyping accuracy and genomic selection. Additionally, we explore the ramifications of a large phenotypic difference between parents in a QTL mapping cross, a scenario that is common in crop genetics but often difficult to simulate.

Highlights

  • The simulation of controlled crosses has been useful in comparing breeding strategies and experimental design of genetic mapping studies[2,3]

  • We describe some of the hurdles to realistic breeding and genetic mapping simulations

  • The structure of founder populations is rarely as uniform as those produced by coalescent simulation. crossword uses actual structures derived from genetic variation data and supplied as VCF or Hapmap files (Fig. 2)


Summary

Introduction

The simulation of controlled crosses has been useful in comparing breeding strategies (reviewed in [1]) and experimental design of genetic mapping studies[2,3]. Current open-source packages have expanded the realism and utility of simulated scenarios by incorporating elements such as variation in recombination frequencies, novel transgenic approaches, and genomic selection[4,5]. Still, this realism comes at a significant cost, both in computational speed and in difficulty of use. The platform, called “crossword”, is essentially a domain-specific language that, when executed, is interpreted into and executed by the R statistical programming environment. This layer of abstraction allows users to focus less on the mechanics of implementing the simulation and more on their actual experimental goals and/or breeding ideas. Unrealistic simplifying assumptions about founder structure can have substantial implications for predicting breeding results[6]. Although some of these simulators offer a means to supply parental genotypes, this functionality is very difficult to implement in practice. We report on the methods by which crossword simplifies these problems, and we give examples of how the resultant realism can be critical to major aspects of experimental design and interpretation.

