Background. National epidemiological investigations of microbial infections benefit greatly from the additional information provided by whole-genome sequencing (WGS), combined with standardized approaches for data sharing and analysis.

Aim. To evaluate the quality and accuracy of WGS data generated by different laboratories but analysed with joint pipelines, in support of a national surveillance approach.

Methods. A national methicillin-resistant Staphylococcus aureus (MRSA) collection of 20 strains was distributed to nine participating laboratories, each of which performed WGS using its in-house procedures. Raw data were shared and analysed with three pipelines: 1928 Diagnostics, JASEN (GMS pipeline) and CLC-Genomics Workbench. The outcomes were compared with respect to quality, correct strain identification and genetic distances.

Results. One isolate contained intraspecies contamination and was excluded from further analysis. The mean sequencing depth varied between sites and technologies. Nevertheless, all analysis methods identified 12 strains that each belonged to one of five outbreak clusters. The cluster cut-off was set at <10 allele differences for core genome multilocus sequence typing (cgMLST) and <20 genetic differences for SNP analysis in pairwise comparisons.

Conclusions. MRSA isolates that were whole-genome sequenced by different laboratories and analysed using the same bioinformatic pipelines yielded comparable outbreak clustering results for both cgMLST and SNP analysis when using the 1928 analysis pipeline. In this study, JASEN was best suited to analysing Illumina data, and CLC to analysing data within each respective sequencing technology. In the future, real-time sharing of data and harmonized analysis within the Genomic Medicine Sweden consortium will further facilitate investigations of outbreaks and transmission routes.
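The pairwise cut-offs stated in the Results can be illustrated with a minimal Python sketch. The abstract gives only the thresholds (<10 allele differences for cgMLST, <20 differences for SNP analysis); grouping linked isolates by single linkage is an assumption made here for illustration and is not necessarily how the three pipelines assign clusters. The function name, isolate labels and distance values are hypothetical and are not data from the study.

```python
from itertools import combinations

# Cut-offs as stated in the abstract (strictly "less than").
CGMLST_CUTOFF = 10  # allele differences
SNP_CUTOFF = 20     # SNP differences


def cluster_by_cutoff(isolates, distances, cutoff):
    """Group isolates whose pairwise distance is below `cutoff`.

    Assumption: single-linkage grouping (an isolate joins a cluster if it
    is within the cut-off of at least one member). `distances` maps
    frozenset({a, b}) to the pairwise distance for every pair.
    """
    parent = {name: name for name in isolates}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if distances[frozenset((a, b))] < cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)

    # Report only groups with at least two isolates; singletons are sporadic.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical cgMLST allele distances, invented for illustration only.
example_isolates = ["MRSA-A", "MRSA-B", "MRSA-C", "MRSA-D"]
example_distances = {
    frozenset(("MRSA-A", "MRSA-B")): 3,
    frozenset(("MRSA-A", "MRSA-C")): 150,
    frozenset(("MRSA-A", "MRSA-D")): 140,
    frozenset(("MRSA-B", "MRSA-C")): 152,
    frozenset(("MRSA-B", "MRSA-D")): 138,
    frozenset(("MRSA-C", "MRSA-D")): 9,
}

clusters = cluster_by_cutoff(example_isolates, example_distances, CGMLST_CUTOFF)
print(clusters)
```

Because the cut-off is strict, a pair with exactly 10 allele differences would not be linked under this sketch. The same function can be applied to SNP distances by passing `SNP_CUTOFF`.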