Abstract
Background: Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. Typically, this is accomplished by targeting housekeeping genes that provide resolution down to the family, genus, and sometimes species level. Unfortunately, this level of resolution is not sufficient in many applications where strain-level identification of bacteria is required (biodefense, forensics, clinical diagnostics, and outbreak investigations). Adding more genomic targets will increase the resolution, but the challenge is identifying the appropriate targets. VaST was developed to address this challenge by finding the minimum number of targets that, in combination, achieve maximum strain-level resolution for any strain complex. The final combination of target regions identified by the algorithm produces a unique haplotype for each strain, which can be used as a fingerprint for identifying unknown samples in a TAS assay. VaST ensures that the targets have conserved primer regions so that the targets can be amplified in all of the known strains. It also favors the inclusion of targets with basal variants, which makes the set more robust when identifying previously unseen strains.
Results: We analyzed VaST’s performance using a number of different pathogenic species that are relevant to human disease outbreaks and biodefense. Achieving full resolution required 20 to 88% fewer targets than would be required in the worst case, and most of the resolution is achieved within the first 20 targets. We computationally and experimentally validated one of the VaST panels and found that the targets led to accurate phylogenetic placement of strains, even when the strains were not a part of the original panel design.
Conclusions: VaST is open source software that, when provided a set of variant sites, can find the minimum number of sites that will provide maximum resolution of a strain complex, and it has many different run-time options that can accommodate a wide range of applications. VaST can be an effective tool in the design of strain identification panels that, when combined with TAS technologies, offer an efficient and inexpensive strain typing protocol.
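The idea of picking a small set of variant sites that together separate every strain can be illustrated with a simple greedy search. The sketch below is not VaST's actual implementation: the function names, the string-per-strain input format, and the toy data are all hypothetical, and it ignores the primer-conservation and basal-variant criteria described above. It only shows how a set of sites can be grown until every strain has a unique haplotype.

```python
from typing import Dict, List


def count_groups(matrix: Dict[str, str], sites: List[int]) -> int:
    """Number of distinct haplotypes when strains are read only at `sites`."""
    return len({tuple(calls[i] for i in sites) for calls in matrix.values()})


def greedy_spanning_set(matrix: Dict[str, str]) -> List[int]:
    """Greedy sketch: repeatedly add the site that yields the most groups."""
    n_sites = len(next(iter(matrix.values())))
    # Best achievable resolution: the number of groups when every site is used.
    max_groups = count_groups(matrix, list(range(n_sites)))
    chosen: List[int] = []
    while count_groups(matrix, chosen) < max_groups:
        remaining = [i for i in range(n_sites) if i not in chosen]
        best = max(remaining, key=lambda i: count_groups(matrix, chosen + [i]))
        chosen.append(best)
    return chosen


# Hypothetical toy input: one string of base calls per strain,
# one column per variant site.
toy_matrix = {
    "strain_A": "AAGT",
    "strain_B": "AAGC",
    "strain_C": "ATGC",
    "strain_D": "GTCC",
}

targets = greedy_spanning_set(toy_matrix)
# The haplotype "fingerprint" of each strain at the chosen sites.
haplotypes = {
    name: "".join(calls[i] for i in targets)
    for name, calls in toy_matrix.items()
}
print(targets, haplotypes)
```

In this toy case the greedy search skips the third site, which adds no resolution once the first is chosen. A greedy search is a heuristic and is not guaranteed to find the true minimum on larger inputs.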
Highlights
Targeted Polymerase Chain Reaction (PCR) amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains
Benchmarking: We benchmarked VaST’s performance using six bacterial strain complexes: 537 strains of Escherichia coli using 189,570 single nucleotide polymorphisms (SNPs), 373 strains of Burkholderia pseudomallei using 94,647 SNPs, 269 strains of Yersinia pestis using 11,249 SNPs, 186 strains of Bacillus anthracis using 11,989 SNPs, 64 strains of Francisella tularensis using 16,720 SNPs, and 122 strains of Staphylococcus aureus using 169,382 SNPs. These pathogens were chosen based on their relevance to human disease outbreaks and their potential for use as biothreat agents.
We generated minimum spanning sets for each strain complex to demonstrate how well VaST performs in a number of genomic contexts
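One way to judge how well a spanning set performs is to track how resolution grows as targets are added in order. The sketch below assumes a hypothetical definition of resolution (distinct haplotypes divided by the number of strains) and uses made-up toy data. The function name and input format are illustrative and are not taken from VaST.

```python
from typing import Dict, List


def resolution_curve(matrix: Dict[str, str], ordered_sites: List[int]) -> List[float]:
    """Fraction of strains with a unique group after each added target.

    Assumption: resolution = distinct haplotypes / number of strains.
    """
    n_strains = len(matrix)
    curve: List[float] = []
    for k in range(1, len(ordered_sites) + 1):
        sites = ordered_sites[:k]
        groups = {tuple(calls[i] for i in sites) for calls in matrix.values()}
        curve.append(len(groups) / n_strains)
    return curve


# Hypothetical toy matrix: one string of base calls per strain.
example_matrix = {
    "strain_A": "AAGT",
    "strain_B": "AAGC",
    "strain_C": "ATGC",
    "strain_D": "GTCC",
}

# Hypothetical ordered target list (column indices into the strings above).
curve = resolution_curve(example_matrix, [0, 1, 3])
print(curve)
```

A curve that flattens early indicates that most of the resolution comes from the first few targets, so a panel could be truncated there if full resolution is not required.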
Summary
Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. This is accomplished by targeting housekeeping genes that provide resolution down to the family, genus, and sometimes species level. PCR enrichment of target sequences allows TAS to be more cost effective than whole genome sequencing and tolerant to low amounts of starting material [8]. Combining this with high-throughput sequencing (HTS) technology allows scaled processing of hundreds to thousands of samples on a single machine. The challenge is deciding which targets to choose to achieve the desired outcome.
Furstenau et al. BMC Bioinformatics (2018) 19:222