Abstract
Motivation: With the availability of new sequencing technologies, the generation of haplotype-resolved genome assemblies up to chromosome scale has become feasible. These assemblies capture the complete genetic information of both parental haplotypes, increase structural variant (SV) calling sensitivity and enable direct genotyping and phasing of SVs. Yet, existing SV callers are designed for haploid genome assemblies only, do not support genotyping or detect only a limited set of SV classes.
Results: We introduce our method SVIM-asm for the detection and genotyping of six common classes of SVs from haploid and diploid genome assemblies. Compared against the only other existing SV caller for diploid assemblies, DipCall, SVIM-asm detects more SV classes and reached higher F1 scores for the detection of insertions and deletions on two recently published assemblies of the HG002 individual.
Availability and implementation: SVIM-asm has been implemented in Python and can be easily installed via bioconda. Its source code is available at github.com/eldariont/svim-asm.
Supplementary information: Supplementary data are available at Bioinformatics online.
Highlights
As one of the main classes of genomic variation, structural variants (SVs) comprise a diverse range of genomic rearrangements with sizes larger than 50 bps
Although SVIM-asm follows a workflow similar to that of SVIM, several adaptations have been made to account for the unique properties of assembly alignments compared to read alignments
SVIM-asm performed slightly better than DipCall with F1 scores of 93.2% (Assembly A) and 93.7% (Assembly B) compared to 91.7% and 92.5%, respectively
Summary
As one of the main classes of genomic variation, structural variants (SVs) comprise a diverse range of genomic rearrangements with sizes larger than 50 bps. Due to the availability of affordable and accurate next-generation sequencing (NGS) technology, SVs are commonly detected by the analysis of sequencing reads. The reads from a genome under investigation are aligned to an existing reference genome to reveal differences between both genomes (read-based SV calling). De novo assembly uses sequence overlaps between reads to computationally reconstruct longer genomic fragments, called contigs. These assembly contigs can be aligned to a reference or comparison genome to facilitate the detection of SVs (assembly-based SV calling) (Sedlazeck et al., 2018).
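To make the idea of assembly-based SV calling concrete, the following minimal sketch scans the CIGAR string of a single contig-to-reference alignment and reports long insertions and deletions. It is an illustration only and not SVIM-asm's implementation: the function name `find_indels`, the example CIGAR string, the start coordinate and the choice to treat gaps of at least 50 bp as SV candidates are all assumptions made for this example. Real callers additionally handle split alignments, other SV classes and genotyping.

```python
import re

# Size cutoff taken from the text (SVs are roughly 50 bp and larger);
# treating exactly 50 bp as an SV is an assumption of this sketch.
MIN_SV_SIZE = 50

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def find_indels(cigar, ref_start, min_size=MIN_SV_SIZE):
    """Return (type, ref_pos, length) for long insertions/deletions.

    Hypothetical helper: `cigar` is the CIGAR string of one contig
    alignment and `ref_start` its 0-based start on the reference.
    """
    calls = []
    ref_pos = ref_start
    for length_str, op in CIGAR_OP.findall(cigar):
        length = int(length_str)
        if op == "D" and length >= min_size:
            calls.append(("DEL", ref_pos, length))
        elif op == "I" and length >= min_size:
            calls.append(("INS", ref_pos, length))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return calls


# Made-up alignment: a 120 bp deletion, a 3 bp insertion (too short to
# count as an SV) and a 75 bp insertion.
calls = find_indels("1000M120D500M3I200M75I300M", ref_start=10_000)
print(calls)
```

Because a contig usually spans far more sequence than a single read, one alignment like this can contain an entire SV within its gaps, which is what makes the assembly-based approach attractive for larger events.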