Abstract

Summary: Genotype Query Tools (GQT) was developed to discover disease-causing variations among billions of genotypes and millions of genomes, and it processes data substantially faster than other existing methods. While GQT has been available to a wide audience as command-line software, the difficulty of constructing queries has limited its applicability among non-IT or non-bioinformatics researchers. To overcome this limitation, we developed webGQT, an easy-to-use tool with a graphical user interface. With pre-built queries across three modules, webGQT allows for pedigree analysis, case-control studies, and population frequency studies. As a package, webGQT allows researchers with little or no applied bioinformatics/IT experience to mine potential disease-causing variants from billions of genotypes.

Results: webGQT offers a flexible and easy-to-use interface for model-based candidate variant filtering for Mendelian diseases, from thousands to millions of genomes, with reduced computation time. Additionally, webGQT provides adjustable parameters to reduce false positives and rescue missing genotypes across all modules. Using a case study, we demonstrate the applicability of webGQT to query non-human genomes. In addition, we demonstrate the scalability of webGQT on large data sets by implementing complex population-specific queries on the 1000 Genomes Project Phase 3 data set, which includes 8.4 billion variants from 2504 individuals across 26 different populations. Furthermore, webGQT supports filtering single-nucleotide variants, short insertions/deletions, copy number variants, or any other variant genotypes supported by the VCF specification. Our results show that webGQT can be used as an online web service, or deployed on personal computers or local servers within research groups.

Availability: webGQT is made available to users in three forms: 1) as a web server available at https://vm1138.kaj.pouta.csc.fi/webgqt/, 2) as an R package to install on personal computers, and 3) as part of the same R package to configure on the user's own servers. The application is available for installation at https://github.com/arumds/webgqt.

Highlights

  • Exome sequencing, genome sequencing, and gene panel sequencing methods have become the de facto methods for studying the heritability of human and non-human genetic diseases involving small pedigrees to large-scale population cohorts

  • The first step in using the web server for GQT (webGQT) is to select the type of variant database to query. webGQT is deployed with 1000

  • To use webGQT on a custom data set through the web server, the user is either required to upload the Genotype Query Tools (GQT) index files by clicking the “Upload variant call format (VCF)” button or to deploy the application with a default data set on their personal computer or server

Introduction

Exome sequencing, genome sequencing, and gene panel sequencing have become the de facto methods for studying the heritability of human and non-human genetic diseases, from small pedigrees to large-scale population cohorts. The current wealth of sequenced genomes produces billions of genotypes in the variant call format (VCF), which calls for publicly available programs that filter candidate variants effectively and rapidly for personalized disease and population genomics. Genotype Query Tools (GQT), a command-line software (Layer et al., 2016), was developed to query and scale up to the billions of loci from the UK 100,000 Genomes Project (Samuel and Farsides, 2017) and the expected millions of microbial, plant and animal genomes (Stephens et al., 2015) using a Word-Aligned Hybrid (WAH) compressed bitmap index. Users work through the GQT command-line interface to filter inherited variants in small pedigrees, filter variants in case-control studies, and compare variants among different cohorts. While GQT is available to a wide audience as command-line software, the difficulty that non-IT or non-bioinformatics researchers face in constructing queries has limited its applicability among many users.
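To make the idea of a bitmap genotype index concrete, the following is a minimal Python sketch of a pedigree-style query over per-sample bitmaps. It is an illustration under stated assumptions, not GQT's or webGQT's implementation: the bitmaps are plain Python integers with no WAH compression, the genotype coding (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate, 3 = unknown) is chosen for the example, and the names `build_index` and `recessive_candidates` are hypothetical.

```python
from typing import Dict, List

# Assumed genotype coding for this sketch only.
HOM_REF, HET, HOM_ALT, UNKNOWN = 0, 1, 2, 3


def build_index(genotypes: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Build four bitmaps per sample, one per genotype class.

    Bit i of index[sample][cls] is set when variant i has class cls
    in that sample. Bitmaps are uncompressed Python integers.
    """
    index = {}
    for sample, calls in genotypes.items():
        bitmaps = [0, 0, 0, 0]
        for i, gt in enumerate(calls):
            bitmaps[gt] |= 1 << i
        index[sample] = bitmaps
    return index


def recessive_candidates(index: Dict[str, List[int]],
                         affected: List[str],
                         unaffected: List[str],
                         n_variants: int) -> List[int]:
    """Return variant positions that are homozygous alternate in every
    affected sample and in no unaffected sample."""
    result = (1 << n_variants) - 1  # start with all variants selected
    for sample in affected:
        result &= index[sample][HOM_ALT]
    for sample in unaffected:
        result &= ~index[sample][HOM_ALT]
    return [i for i in range(n_variants) if (result >> i) & 1]


# Toy trio with four variants (hypothetical data).
trio = {
    "child":  [2, 1, 2, 0],
    "mother": [1, 1, 2, 0],
    "father": [1, 0, 1, 0],
}
trio_index = build_index(trio)
print(recessive_candidates(trio_index, ["child"], ["mother", "father"], 4))
```

The point of the sketch is that a filter over many samples reduces to a handful of bitwise AND and NOT operations, one per sample, rather than a scan over every genotype; compressing the bitmaps, as GQT does with WAH, is a separate step not shown here.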
