Abstract

Over the past decade, we have gained considerable insight into sequence variation within the rDNA array of Saccharomyces cerevisiae and its closest wild relative, Saccharomyces paradoxus. Yet significant challenges remain in the computational characterisation of this complex genomic region. This study evaluated the use of variation graphs for this purpose, formally comparing their effectiveness with that of traditional linear approaches. Specifically, we aimed to identify both partial and fixed variants (i.e. pSNPs, SNPs, pINDELs and INDELs) in the rDNA arrays of 10 diverse, haploid Saccharomyces cerevisiae strains with high-quality genomic datasets. We constructed two computational pipelines based on markedly different approaches. The first used the BWA read mapper and the BCFtools variant caller to identify variants against the linear S288c reference; the second used the vg tool to call variants against a graph reference (either a graph representation of the S288c genome or a Saccharomyces cerevisiae pan-genome). The graph-based pipeline identified more variants than the linear pipeline, in particular partial variants, but it also missed some key variants identified by BWA/BCFtools. A major discrepancy between the two pipelines was found in the read coverage at loci where the vg pipeline identified variants. In the coming months, we aim to investigate the cause of these differences and to develop a new graph-based computational pipeline that can accurately identify the full range of sequence and copy number variation within this key genomic region.
