Abstract
Genome graphs are powerful data structures for representing the genetic diversity within populations. They can help identify genomic variations that traditional linear references miss, but their complexity and size make them challenging to analyze. We sought to develop a genome graph analysis tool that makes these analyses more accessible by addressing the limitations of existing tools. Specifically, we improve scalability and user-friendliness, and we provide many new statistics tailored to variation graphs for graph evaluation, including sample-specific features. We developed gretl, an efficient, comprehensive, and integrated tool that analyzes genome graphs and provides a wide range of statistics to give insight into their structure and composition. gretl can be used to evaluate different graphs, to compare the output of graph construction pipelines run with different parameters, and to perform an in-depth analysis of individual graphs, including sample-specific analysis. With the assistance of gretl, novel patterns of genetic variation and potential regions of interest can be identified for later, more detailed inspection. We demonstrate that gretl outperforms other tools in terms of speed, particularly for larger genome graphs. Commented Rust source code and documentation are available under the MIT license at https://github.com/MoinSebi/gretl, together with Python scripts and step-by-step usage examples. The package is available on Bioconda for easy installation.