Abstract
Although whole-genome sequencing (WGS) is a well-established research method, its use for routine molecular typing and pathogen characterization remains a substantial challenge due to the required bioinformatics resources and/or expertise. Moreover, many national reference laboratories and centers, as well as other laboratories working under a quality system, require extensive validation to demonstrate that employed methods are “fit-for-purpose” and provide high-quality results. However, a harmonized framework with guidelines for the validation of WGS workflows does not yet exist, despite several recent case studies highlighting the urgent need for one. We present a validation strategy focusing specifically on the exhaustive characterization of the bioinformatics analysis of a WGS workflow designed to replace conventionally employed molecular typing methods for microbial isolates in a representative small-scale laboratory, using the pathogen Neisseria meningitidis as a proof-of-concept. We adapted several classically employed performance metrics specifically toward three different bioinformatics assays: resistance gene characterization (based on the ARG-ANNOT, ResFinder, CARD, and NDARO databases), several commonly employed typing schemas (including, among others, core genome multilocus sequence typing), and serogroup determination. We analyzed a core validation dataset of 67 well-characterized samples that were typed by means of classical genotypic and/or phenotypic methods and sequenced in-house, which allowed us to evaluate the repeatability, reproducibility, accuracy, precision, sensitivity, and specificity of the different bioinformatics assays. We also analyzed an extended validation dataset composed of publicly available WGS data for 64 samples by comparing results of the different bioinformatics assays against results obtained from commonly used bioinformatics tools.
We demonstrate high performance, with values for all performance metrics >87%, >97%, and >90% for the resistance gene characterization, sequence typing, and serogroup determination assays, respectively, for both validation datasets. Our WGS workflow has been made publicly available as a “push-button” pipeline for Illumina data at https://galaxy.sciensano.be to showcase its implementation for non-profit and/or academic usage. Our validation strategy can be adapted to other WGS workflows for other pathogens of interest and demonstrates the added value and feasibility of employing WGS, with the aim of integrating it into routine use in an applied public health setting.
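The performance metrics named above can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions of accuracy, precision, sensitivity, and specificity, and it uses simple concordance between two runs as a stand-in for repeatability and reproducibility; the adapted definitions actually used in the study are not given in this abstract and may differ. All names (`ConfusionCounts`, `performance_metrics`, `concordance`) and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts obtained by comparing assay calls against reference results."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def performance_metrics(c):
    """Standard confusion-matrix metrics (an assumption, not the study's own definitions)."""
    return {
        "accuracy": ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": ratio(c.tp, c.tp + c.fp),
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
    }


def concordance(run_a, run_b):
    """Fraction of samples with identical calls in two runs.

    Used here as an assumed proxy for repeatability (same conditions)
    or reproducibility (different conditions).
    """
    if len(run_a) != len(run_b):
        raise ValueError("both runs must cover the same samples")
    matches = sum(a == b for a, b in zip(run_a, run_b))
    return ratio(matches, len(run_a))


# Hypothetical counts, not taken from the study.
counts = ConfusionCounts(tp=90, fp=5, tn=100, fn=10)
metrics = performance_metrics(counts)

# Hypothetical serogroup calls for three samples in two replicate runs.
agreement = concordance(["B", "C", "W"], ["B", "C", "Y"])
```

Reporting each metric separately per assay, as the abstract does, keeps a weakness in one metric (for example, sensitivity) from being masked by a high overall accuracy.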
Highlights
Whole-genome sequencing (WGS) has become a well-established technique, spurred by the rapid development of different next-generation sequencing (NGS) technologies, and ample case studies have been published in recent years that demonstrate the added value of WGS for surveillance monitoring and outbreak cases for many microbial pathogens of interest in public health (Mellmann et al, 2011; Kwong et al, 2015; Aanensen et al, 2016; Charpentier et al, 2017; Harris et al, 2018).
Our study is relevant because recent surveys by both the European Food Safety Authority (EFSA) (García Fierro et al, 2018) and the European Centre for Disease Prevention and Control (ECDC) (Revez et al, 2017) have indicated that, at least in Europe, the data analysis and required expertise remain substantial bottlenecks impeding the implementation of NGS for routine use in microbiology.
A series of recently published studies have showcased the need for such validation and presented validation approaches for certain components of the WGS workflow, focusing either on a modular template for the validation of WGS processes (Kozyreva et al, 2017), the entire workflow “end-to-end” (Portmann et al, 2018), standardization (Holmes et al, 2018), external quality assessment (Mellmann et al, 2017), commercial bioinformatics software (Lindsey et al, 2016), outbreak clustering (Dallman et al, 2015), or specific assays such as serotyping (Yachison et al, 2017). We complement these studies by proposing a validation strategy focusing on the bioinformatics analysis of the WGS workflow to exhaustively evaluate performance at this level, which is crucial because the bioinformatics component serves as the “common denominator” that allows comparison of the different steps of the WGS workflow, or even of different WGS workflows and/or sequencing technologies.
Summary
Whole-genome sequencing (WGS) has become a well-established technique, spurred by the rapid development of different next-generation sequencing (NGS) technologies, and ample case studies have been published in recent years that demonstrate the added value of WGS for surveillance monitoring and outbreak cases for many microbial pathogens of interest in public health (Mellmann et al, 2011; Kwong et al, 2015; Aanensen et al, 2016; Charpentier et al, 2017; Harris et al, 2018). In Europe, recent surveys in 2016 by both the European Food Safety Authority (EFSA) (García Fierro et al, 2018) and the European Centre for Disease Prevention and Control (ECDC) (Revez et al, 2017) indicated that NGS was being used in 17 out of 30 and 25 out of 29 responding constituents, respectively, and that large discrepancies existed between different European countries in the advancement of implementing this technology for different microbial pathogens of interest, for which the lack of expertise and financial resources were often quoted.