Abstract
Summary: Quality control (QC) of genome-wide association study (GWAS) result files has become increasingly difficult due to advances in genomic technology. The main challenges include continuous increases in the number of polymorphic genetic variants contained in recent GWASs and reference panels, the rising number of cohorts participating in a GWAS consortium, and inclusion of new variant types. Here, we present GWASinspector, a flexible R package for comprehensive QC of GWAS results. This package is compatible with recent imputation reference panels, handles insertion/deletion and multi-allelic variants, provides extensive QC reports and efficiently processes big data files. Reference panels covering three human genome builds (NCBI36, GRCh37 and GRCh38) are available. GWASinspector has a user-friendly design and allows easy set-up of the QC pipeline through a configuration file. In addition to checking and reporting on individual files, it can be used in preparation of a meta-analysis by testing for systematic differences between studies and generating cleaned, harmonized GWAS files. Comparison with existing GWAS QC tools shows that the main advantages of GWASinspector are its ability to deal more effectively with insertion/deletion and multi-allelic variants and its relatively low memory use.
Availability and implementation: Our package is available at the Comprehensive R Archive Network (CRAN): https://CRAN.R-project.org/package=GWASinspector. Reference datasets and a detailed tutorial can be found at the package website at http://gwasinspector.com/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Highlights
Recent genome-wide association studies (GWASs) use imputation reference panels based on next-generation sequencing technology
The main challenges include continuous increases in the number of polymorphic genetic variants contained in recent GWASs and reference panels, the rising number of cohorts participating in a GWAS consortium, and inclusion of new variant types
We present GWASinspector, a flexible R package for comprehensive quality control (QC) of GWAS results. This package is compatible with recent imputation reference panels, handles insertion/deletion and multi-allelic variants, provides extensive QC reports and efficiently processes big data files
Summary
Recent genome-wide association studies (GWASs) use imputation reference panels based on next-generation sequencing technology. Software packages such as GWAStools (Gogarten et al., 2012), GWAtoolbox (Fuchsberger et al., 2012), QCGWAS (van der Most et al., 2014) and EasyQC (Winkler et al., 2014) have previously been developed for quality control (QC) of GWAS result files. These do not properly address current key challenges, including the diversity of allele frequency reference panels and the inclusion of new variant types such as insertion/deletion (indel) and multi-allelic variants. The sheer size of the result files and of the reference panel(s) also poses a problem. This issue is more evident in meta-analysis projects involving numerous result files from multiple sources, which calls for more time-efficient QC software. This motivated us to develop a new package for the QC of GWAS result files that addresses the above-mentioned shortcomings. Besides QC of single files, GWASinspector can be used in large-scale consortium projects to check for systematic differences between the results reported by different cohorts and to generate cleaned, harmonized GWAS files ready for meta-analysis.
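To make concrete what harmonizing a result file against a reference panel involves, and why indel and multi-allelic variants complicate it, the following minimal Python sketch aligns one reported variant to a toy reference. It is an illustration under stated assumptions, not GWASinspector's implementation or API: the `Variant` class, the `harmonize` function, the reference layout and all values are hypothetical, and strand flips are deliberately ignored.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    """One row of a hypothetical GWAS result file."""
    chrom: str
    pos: int
    effect_allele: str
    other_allele: str
    beta: float  # effect size for effect_allele
    eaf: float   # effect allele frequency


# Hypothetical reference layout: (chrom, pos) -> list of (ref, alt) pairs.
# A list is needed because a multi-allelic site has several entries
# at the same position.
Reference = Dict[Tuple[str, int], List[Tuple[str, str]]]


def harmonize(variant: Variant, reference: Reference) -> Optional[Variant]:
    """Return the variant with alt as the effect allele, or None if unmatched."""
    for ref, alt in reference.get((variant.chrom, variant.pos), []):
        if (variant.other_allele, variant.effect_allele) == (ref, alt):
            return variant  # already aligned to the reference
        if (variant.effect_allele, variant.other_allele) == (ref, alt):
            # Alleles are swapped: switch them, negate the effect size,
            # and take the complementary allele frequency.
            return replace(
                variant,
                effect_allele=alt,
                other_allele=ref,
                beta=-variant.beta,
                eaf=1.0 - variant.eaf,
            )
    return None  # no entry at this position carries this allele pair


# Toy multi-allelic site: a SNP (A>G) and an insertion (A>AT) at one position.
reference: Reference = {("1", 1000): [("A", "G"), ("A", "AT")]}

swapped = Variant("1", 1000, effect_allele="A", other_allele="AT", beta=0.2, eaf=0.7)
aligned = harmonize(swapped, reference)
unmatched = harmonize(Variant("1", 1000, "A", "C", 0.1, 0.5), reference)
```

Matching on the allele pair as well as the position is what lets the insertion be told apart from the SNP at the same site; a lookup keyed on position alone would compare the variant against whichever entry happened to come first.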