Abstract
Background: Single cell Strand-seq is a unique tool for the discovery and phasing of genomic inversions. Conventional methods to discover inversions with Strand-seq data are blind to known inversion locations, limiting their statistical power for the detection of inversions smaller than 10 Kb. Moreover, the methods rely on manual inspection to separate false and true positives.
Results: Here we describe “InvertypeR”, a method based on a Bayesian binomial model that genotypes inversions using fixed genomic coordinates. We validated InvertypeR by re-genotyping inversions reported for three trios by the Human Genome Structural Variation Consortium. Although 6.3% of the family inversion genotypes in the original study showed Mendelian discordance, this was reduced to 0.5% using InvertypeR. By applying InvertypeR to published inversion coordinates and predicted inversion hotspots (n = 3701), as well as coordinates from conventional inversion discovery, we furthermore genotyped 66 inversions not previously reported for the three trios.
Conclusions: InvertypeR discovers, genotypes, and phases inversions without relying on manual inspection. For greater accessibility, results are presented as phased chromosome ideograms with inversions linked to Strand-seq data in the genome browser. InvertypeR increases the power of Strand-seq for studies on the role of inversions in phenotypic variation, genome instability, and human disease.
Highlights
Single cell Strand-seq is a unique tool for the discovery and phasing of genomic inversions
We noticed that 6.3% (42/667) of the inversion genotypes reported for the three Human Genome Structural Variation Consortium (HGSVC) trios [3] showed Mendelian discordance, in that a parent did not have one of the child’s two alleles (Fig. 1, Supplemental Figure S1, Additional file 1)
We developed a Bayesian bioinformatic program in R, InvertypeR, that analyses Strand-seq data to generate genome-wide phased inversion genotypes
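To make the notion of Mendelian discordance in the highlights concrete, here is a minimal sketch of a trio consistency check. It assumes biallelic autosomal genotypes encoded as pairs of alleles (0 for the reference orientation, 1 for the inverted orientation). The function names and the encoding are illustrative assumptions, not part of InvertypeR or the HGSVC analysis.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child genotype can be built from one allele per parent.

    Each genotype is a pair of alleles: 0 = reference orientation, 1 = inverted.
    This encoding is an assumption made for the example.
    """
    child_sorted = tuple(sorted(child))
    # Try every combination of one maternal and one paternal allele.
    return any(
        tuple(sorted((m, f))) == child_sorted
        for m, f in product(mother, father)
    )


def discordance_rate(trios):
    """Fraction of (child, mother, father) genotype triples that are discordant."""
    discordant = sum(not is_mendelian_consistent(*trio) for trio in trios)
    return discordant / len(trios)


# Hypothetical genotypes for illustration only.
example_trios = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from mother, 1 from father
    ((1, 1), (0, 0), (0, 1)),  # discordant: mother carries no inverted allele
]
print(discordance_rate(example_trios))

# The reported figure is plain arithmetic: 42 of 667 genotypes.
print(round(100 * 42 / 667, 1))
```

The check ignores sex chromosomes and de novo events, so it is only a rough stand-in for how discordance would be counted in a real study.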
Summary
Single cell Strand-seq is a unique tool for the discovery and phasing of genomic inversions. Whereas copy number variants are readily detected using either short- or long-read human sequencing techniques and microarrays, the detection of inversions is more challenging [2]. This is especially true for inversions flanked by stretches of repetitive DNA that exceed 10 Kb. Inversions are known to cause phenotypic variation and disease, including microdeletion and microduplication syndromes, by suppressing recombination and disrupting genes or regulatory regions [3, 4, 5, 6, 7]. Methods that map inversions genome-wide will facilitate many novel studies in medical genetics, including studies of their functional consequences. For such studies, heterozygous inversions should ideally be phased. Strand-seq reads capture only one of the two strands of DNA for each homolog, meaning that inversions are visible as groups of mapped reads with a different orientation than their neighbours [3, 8, 9, 10, 11].
Hanlon et al. BMC Genomics (2021) 22:582
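The read-orientation signal described in the summary lends itself to a binomial model of the kind named in the abstract. The sketch below is a simplified illustration, not InvertypeR's actual model. It assumes reads are pooled from cells in which both homologs were sequenced on the same template strand, so that the fraction of reads with the unexpected orientation is close to 0 for a homozygous reference region, about 0.5 for a heterozygous inversion, and close to 1 for a homozygous inversion. The genotype labels, the `error_rate` parameter, and the uniform prior are all assumptions made for the example.

```python
from math import exp, log


def genotype_posterior(n_reads, n_reversed, error_rate=0.05, prior=None):
    """Posterior probability of each genotype given read orientation counts.

    n_reads:    total reads mapped to the candidate region
    n_reversed: reads whose orientation differs from the flanking sequence
    error_rate: assumed background fraction of misoriented reads (hypothetical)
    prior:      dict of strictly positive prior probabilities per genotype;
                defaults to a uniform prior (an assumption, not the paper's)
    """
    genotypes = ("ref", "het", "hom")
    if prior is None:
        prior = {g: 1.0 / len(genotypes) for g in genotypes}

    # Expected fraction of reversed reads under each genotype.
    expected = {"ref": error_rate, "het": 0.5, "hom": 1.0 - error_rate}
    n_forward = n_reads - n_reversed

    # Binomial log-likelihood plus log-prior. The binomial coefficient is the
    # same for every genotype, so it cancels on normalisation and is omitted.
    log_post = {
        g: log(prior[g])
        + n_reversed * log(expected[g])
        + n_forward * log(1.0 - expected[g])
        for g in genotypes
    }

    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_post.values())
    weights = {g: exp(value - peak) for g, value in log_post.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


# Hypothetical counts: 10 of 20 reads are reversed.
posterior = genotype_posterior(n_reads=20, n_reversed=10)
print(max(posterior, key=posterior.get), posterior)
```

Genotyping at fixed coordinates means the counts come from a predefined interval, which is why a handful of reads can be informative for a small inversion. A prior that favours the reference genotype would be a natural refinement of the uniform prior used here.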