Abstract

Linked-read sequencing greatly improves haplotype assembly over standard paired-end analysis. The detection of mosaic single-nucleotide variants benefits from haplotype assembly when the model is informed by the mapping between constituent reads and linked reads. Samovar evaluates haplotype-discordant reads identified through linked-read sequencing, thus enabling phasing and mosaic variant detection across the entire genome. Samovar trains a random forest model to score candidate sites using a dataset that considers read quality, phasing, and linked-read characteristics. Samovar calls mosaic single-nucleotide variants (SNVs) within a single sample with accuracy comparable to that which previously required trios or matched tumor/normal pairs, and it outperforms single-sample mosaic variant callers at minor allele frequencies of 5%–50% with at least 30X coverage. Samovar finds somatic variants in both tumor and normal whole-genome sequencing from 13 pediatric cancer cases that can be corroborated with high recall by whole-exome sequencing. Samovar is available open-source at https://github.com/cdarby/samovar under the MIT license.
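To make the notion of a haplotype-discordant read concrete, the following is a minimal Python sketch, not Samovar's actual implementation. It assumes each read at a candidate site has already been reduced to a haplotype assignment and an observed base; the names `ReadObservation` and `haplotype_discordance` are hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Optional


class ReadObservation(NamedTuple):
    """One read's view of a candidate site (illustrative fields, not Samovar's)."""
    haplotype: Optional[int]  # 1 or 2 if the read was phased, None otherwise
    base: str                 # base the read shows at the site


def haplotype_discordance(reads: List[ReadObservation]) -> Dict[int, dict]:
    """Per haplotype, report the majority base and the fraction of reads that disagree."""
    by_hap = defaultdict(list)
    for r in reads:
        if r.haplotype is not None:  # unphased reads carry no haplotype evidence
            by_hap[r.haplotype].append(r.base)

    summary = {}
    for hap, bases in sorted(by_hap.items()):
        counts = Counter(bases)
        major_base, major_n = counts.most_common(1)[0]
        discordant = len(bases) - major_n
        summary[hap] = {
            "major_base": major_base,
            "depth": len(bases),
            "discordant": discordant,
            "discordant_fraction": discordant / len(bases),
        }
    return summary


# Toy pileup: haplotype 1 is uniformly A; haplotype 2 is mostly A with three T reads.
pileup = (
    [ReadObservation(1, "A")] * 15
    + [ReadObservation(2, "A")] * 12
    + [ReadObservation(2, "T")] * 3
    + [ReadObservation(None, "A")] * 2
)
result = haplotype_discordance(pileup)
print(result[2])
```

In this toy pileup, the three T reads on haplotype 2 disagree with the other reads assigned to the same haplotype, which is the kind of pattern a mosaic variant would produce. Per-haplotype counts like these could plausibly serve as inputs to a classifier such as the random forest described above, though the features Samovar actually uses are not specified in this section.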

Highlights

  • Genomic mosaicism results from postzygotic de novo mutations, ranging from single-nucleotide changes to larger structural variants and whole chromosome aneuploidy

  • Somatic mosaicism refers to genetic heterogeneity among non-germ cells, which accrues in normally dividing cells throughout the human lifetime (Gajecka, 2016; Laurie et al, 2012; Kennedy et al, 2012), as corroborated by monozygotic twin studies (Ouwens et al, 2018)

  • Samovar pipeline: we present Samovar, a single-sample mosaic single-nucleotide variant (SNV) caller designed for 10X Genomics linked-read whole-genome sequencing (WGS) data


Introduction

Genomic mosaicism results from postzygotic de novo mutations, ranging from single-nucleotide changes to larger structural variants and whole chromosome aneuploidy. Mosaic mutations are present in some of the cells belonging to the offspring but in none of either parent's cells (Biesecker and Spinner, 2013; Cohen et al, 2015). Somatic mosaicism refers to genetic heterogeneity among non-germ cells, which accrues in normally dividing cells throughout the human lifetime (Gajecka, 2016; Laurie et al, 2012; Kennedy et al, 2012), as corroborated by monozygotic twin studies (Ouwens et al, 2018). Mosaicism plays an important role in many genetic diseases. It has been implicated in autism (Freed and Pevsner, 2016) and is being explored in connection with other neurological diseases (Poduri et al, 2013; McConnell et al, 2017; D’Gama and Walsh, 2018). Causal mosaic mutations have been found for Sturge-Weber syndrome (Shirley et al, 2013), McCune-Albright syndrome (Weinstein et al, 1991), and Proteus syndrome (Lindhurst et al, 2011), among others.
