Abstract

Background: Traditional map-based cloning approaches, used to identify desirable alleles, are extremely labour intensive, and years can elapse between mutagenesis and detection of the polymorphism. The high-throughput, sequencing-based mapping-by-sequencing approach requires an ordered genome assembly and cannot be used with fragmented, unscaffolded draft genomes, which limits its application to model species and excludes many important organisms.

Results: We address this resource gap with a computational method and software implementation called CHERIPIC (Computing Homozygosity Enriched Regions In genomes to Prioritise Identification of Candidate variants). We have validated the CHERIPIC implementation using three different types of bulk segregant sequence data, from Arabidopsis, maize and barley, respectively.

Conclusions: CHERIPIC allows users to rapidly analyse bulk segregant sequence data. We have made it available as a pre-packaged binary with all dependencies for Linux and macOS, and as a Galaxy tool.

Highlights

  • Traditional map-based cloning approaches, used to identify desirable alleles, are extremely labour intensive, and years can elapse between mutagenesis and detection of the polymorphism

  • For a recessive candidate, we expect an allele frequency close to 100% in the mutant bulk and around 33.3% in the background bulk; these expected values allow the thresholds that classify polymorphisms as homozygous or heterozygous to be tuned according to the calculated allele frequency

  • To permit the easy application of our method and to allow users to rapidly analyse bulk segregant sequence data we have produced a range of implementations of the CHERIPIC algorithm
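
The allele-frequency expectation in the highlights above can be illustrated with a small sketch. It assumes a segregating F2 population in which the non-mutant (background) bulk contains homozygous wild-type and heterozygous plants in a 1:2 ratio, which is one way to arrive at the 33.3% figure. The function names and the threshold values `hom_min` and `het_min` are hypothetical, chosen for illustration only; they are not CHERIPIC's actual parameters or defaults.

```python
from fractions import Fraction


def expected_background_af() -> Fraction:
    """Expected mutant-allele frequency in the background bulk.

    Assumption: recessive mutation, F2 population, so plants without the
    mutant phenotype are 1/3 homozygous wild type and 2/3 heterozygous.
    """
    # (share of the bulk, mutant-allele frequency within that genotype)
    genotypes = [
        (Fraction(1, 3), Fraction(0)),     # homozygous wild type
        (Fraction(2, 3), Fraction(1, 2)),  # heterozygous carriers
    ]
    return sum(share * af for share, af in genotypes)


def classify_variant(alt_reads: int, total_reads: int,
                     hom_min: float = 0.8, het_min: float = 0.2) -> str:
    """Label a variant from its observed allele frequency in one bulk.

    hom_min and het_min are illustrative thresholds (assumptions), meant
    to show how the cut-offs could be tuned around the expected values.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    af = alt_reads / total_reads
    if af >= hom_min:
        return "hom"
    if af >= het_min:
        return "het"
    return "ref"


if __name__ == "__main__":
    print(float(expected_background_af()))   # expected background frequency
    print(classify_variant(48, 50))          # variant in the mutant bulk
    print(classify_variant(17, 50))          # same site in the background bulk
```

Under these assumptions, a causative recessive variant would be labelled homozygous in the mutant bulk and heterozygous in the background bulk, which is the contrast the highlight describes.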

Summary

Introduction

Traditional map-based cloning approaches, used to identify desirable alleles, are extremely labour intensive, and years can elapse between mutagenesis and detection of the polymorphism. CHERIPIC uses short contig fragments (such as those from a first-pass assembly of Illumina data or a PacBio sequencing run) from bulk segregant sequencing (BSS) experiments to call variants and to reduce the list of candidates to a few closely linked variants in the region harbouring the trait of interest; in some cases this list includes the candidate mutation itself. CHERIPIC improves on previous methods by being input-type agnostic (it works well on both genome-seq and RNA-seq data), by having extremely low computational requirements, and by being available for direct use through a web interface.
