Abstract

Background

The genome graph is an emerging approach for representing structural variants on genomes with branches. For example, representing the structural variants of a cancer genome as a genome graph is more natural than representing such a genome as differences from the linear reference genome. While more and more structural variants are being identified by long-read sequencing, many of them are difficult to visualize with existing structural variant visualization tools. A visualization method for large genome graphs, such as human cancer genome graphs, is therefore needed.

Results

We developed the MOdular Multi-scale Integrated Genome graph browser, MoMI-G, a web-based genome graph browser that can visualize genome graphs with structural variants and supporting evidence such as read alignments, read depth, and annotations. This browser allows more intuitive recognition of large, nested, and potentially more complex structural variations. MoMI-G has view modules for different scales, which allow users to view anything from the whole genome down to nucleotide-level alignments of long reads. Alignments spanning reference alleles and those spanning alternative alleles are shown in the same view. Users who are not satisfied with the preset views can customize the view. In addition, MoMI-G has the Interval Card Deck, a feature for rapid manual inspection of hundreds of structural variants. Herein, we describe the utility of MoMI-G using representative examples of large and nested structural variations found in two cell lines, LC-2/ad and CHM1.

Conclusions

Users can inspect complex and large structural variations found by long-read analysis in large genomes, such as human genomes, more smoothly and more intuitively. In addition, users can easily filter out false positives by manually inspecting hundreds of identified structural variants, together with supporting long-read alignments and annotations, in a short time.

Software availability

MoMI-G is freely available at https://github.com/MoMI-G/MoMI-G under the MIT license.

Highlights

  • The genome graph is an emerging approach for representing structural variants on genomes with branches

  • Our main contribution is the development of a genome graph browser, MoMI-G; as a first step towards that goal, we developed the MoMI-G tools, which convert the output of structural variant (SV) callers into genome graphs

  • A script that converts a FASTA file of a reference genome and a variant call format (VCF) file into an XG file is included in the MoMI-G package. The VCF format cannot represent some types of SVs that the XG format can represent, such as nested insertions

Summary

Introduction

The genome graph is an emerging approach for representing structural variants on genomes with branches. In the era of long-read sequencing, a typical analysis of SVs in a species starts with aligning long whole-genome shotgun reads to a reference genome, after which SVs are identified as large differences between the reads and the reference genome [4, 5]. This approach does not work for certain regions of a genome. [...] commonly used today [6,7,8]; not including these sequences in the reference genome may lead us to miss causal genetic variants of diseases. Another example is that mutations in a genomic locus with high diversity, such as the human leukocyte antigen region, are hard to identify [9]; this may lead us to miss variants highly associated with diseases or important traits. A genome graph is also referred to as a graph genome [12, 13].

Yokoyama et al. BMC Bioinformatics (2019) 20:548
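To make the idea of "genomes with branches" concrete, the following is a minimal sketch of a genome graph in Python. It is an illustration only and is not the data model used by MoMI-G or by the XG format; the class `GenomeGraph`, its methods, the helper `build_example`, and the toy sequences are all hypothetical. Nodes hold sequence segments, directed edges say which segments may follow one another, and each allele is a path through the graph. The example encodes an insertion that itself contains a second, nested insertion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class GenomeGraph:
    """Toy genome graph: nodes are sequence segments, edges are adjacencies."""
    nodes: Dict[int, str] = field(default_factory=dict)
    edges: Set[Tuple[int, int]] = field(default_factory=set)

    def add_node(self, node_id: int, sequence: str) -> None:
        self.nodes[node_id] = sequence

    def add_edge(self, src: int, dst: int) -> None:
        # Both endpoints must already exist in the graph.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.add((src, dst))

    def spell(self, path: List[int]) -> str:
        """Return the sequence spelled by walking `path` node by node."""
        for a, b in zip(path, path[1:]):
            if (a, b) not in self.edges:
                raise ValueError(f"no edge {a} -> {b}")
        return "".join(self.nodes[n] for n in path)

    def all_paths(self, start: int, end: int) -> List[List[int]]:
        """Enumerate every path from `start` to `end` (assumes an acyclic graph)."""
        if start == end:
            return [[end]]
        paths = []
        for src, dst in sorted(self.edges):
            if src == start:
                for tail in self.all_paths(dst, end):
                    paths.append([start] + tail)
        return paths


def build_example() -> GenomeGraph:
    """Reference 1 -> 5, an insertion (2, 4), and a nested insertion (3)."""
    g = GenomeGraph()
    g.add_node(1, "ACGT")   # left flank (reference)
    g.add_node(2, "GGA")    # first part of the insertion
    g.add_node(3, "TT")     # insertion nested inside the insertion
    g.add_node(4, "C")      # second part of the insertion
    g.add_node(5, "TTAG")   # right flank (reference)
    for src, dst in [(1, 5), (1, 2), (2, 4), (2, 3), (3, 4), (4, 5)]:
        g.add_edge(src, dst)
    return g


if __name__ == "__main__":
    graph = build_example()
    for path in graph.all_paths(1, 5):
        print(path, graph.spell(path))
```

In this sketch the reference allele and both alternative alleles coexist as paths through one structure, which is the property that lets a graph describe nested variation without expressing everything as a difference from a single linear sequence.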

