Abstract

Background: Influenza viruses exist as a large group of closely related viral genomes, also called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. We compared the suitability of two benchtop next-generation sequencers for whole genome influenza A quasispecies analysis: the Illumina MiSeq sequencing-by-synthesis and the Ion Torrent PGM semiconductor sequencing technique.

Results: We first compared the accuracy and sensitivity of both sequencers using plasmid DNA and different ratios of wild type and mutant plasmid. Illumina MiSeq sequencing reads were one and a half times more accurate than those of the Ion Torrent PGM. The majority of sequencing errors were substitutions on the Illumina MiSeq and insertions and deletions, mostly in homopolymer regions, on the Ion Torrent PGM. To evaluate the suitability of the two techniques for determining the genome diversity of influenza A virus, we generated plasmid-derived PR8 virus and grew this virus in vitro. We also optimized an RT-PCR protocol to obtain uniform coverage of all eight genomic RNA segments. The sequencing reads obtained with both sequencers could successfully be assembled de novo into the segmented influenza virus genome. After mapping of the reads to the reference genome, we found that the detection limit for reliable recognition of variants in the viral genome required a frequency of 0.5% or higher. This threshold exceeds the background error rate resulting from the RT-PCR reaction and the sequencing method. Most of the variants in the PR8 virus genome were present in hemagglutinin, and these mutations were detected by both sequencers.

Conclusions: Our approach underlines the power and limitations of two commonly used next-generation sequencers for the analysis of influenza virus gene diversity. We conclude that the Illumina MiSeq platform is better suited for detecting variant sequences whereas the Ion Torrent PGM platform has a shorter turnaround time. The data analysis pipeline that we propose here will also help to standardize variant calling in small RNA genomes based on next-generation sequencing data.
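
The 0.5% detection limit can be illustrated with a minimal sketch of frequency-based variant calling. It assumes that per-position base counts have already been extracted from the mapped reads. The function name `call_variants`, the input layout, and the `min_depth` filter are hypothetical and are not taken from the published pipeline.

```python
from typing import Dict, List, Tuple

MIN_FREQUENCY = 0.005  # 0.5% detection limit reported in the abstract


def call_variants(
    reference: str,
    counts: List[Dict[str, int]],
    min_frequency: float = MIN_FREQUENCY,
    min_depth: int = 1000,  # assumed depth filter, not from the source
) -> List[Tuple[int, str, str, float]]:
    """Return (1-based position, ref base, alt base, frequency) per variant."""
    variants = []
    for index, (ref_base, base_counts) in enumerate(zip(reference, counts)):
        depth = sum(base_counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate a low frequency reliably
        for base, count in sorted(base_counts.items()):
            if base == ref_base:
                continue  # reference base is not a variant
            frequency = count / depth
            if frequency >= min_frequency:
                variants.append((index + 1, ref_base, base, frequency))
    return variants


# Toy data: position 2 carries a 1% G>A variant, position 3 only 0.1% noise.
reference = "AGC"
counts = [
    {"A": 10000},
    {"G": 9900, "A": 100},
    {"C": 9990, "T": 10},
]
variants = call_variants(reference, counts)
```

In this toy input only the position-2 variant passes the threshold. The position-3 signal falls below 0.5% and is treated as background error.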

Highlights

  • Influenza viruses exist as a large group of closely related viral genomes, called quasispecies

  • We used plasmid DNA to compare the accuracy of the sequencing output because it is genetically very stable

  • We generated a plasmid with two tracer mutations. This allowed us to prepare mixtures with defined proportions of wild type and mutant plasmid before sequencing, in order to determine how sensitively each sequencer detects the introduced single nucleotide polymorphism (SNP)
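
The mixing experiment in the last highlight can be sketched as a small calculation of the expected tracer-SNP frequency for each mixture, which is the value an observed read frequency would be compared against. The amounts below are hypothetical, since the actual ratios are not listed here. The helper assumes both plasmids have the same length, so that amounts in equal units translate directly into copy-number fractions.

```python
def expected_mutant_fraction(wild_type_amount: float, mutant_amount: float) -> float:
    """Expected fraction of mutant plasmid in a mixture of equal-length plasmids."""
    total = wild_type_amount + mutant_amount
    if total <= 0:
        raise ValueError("total plasmid amount must be positive")
    return mutant_amount / total


def observed_fraction(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads at the tracer position that carry the mutant base."""
    if total_reads <= 0:
        raise ValueError("total read count must be positive")
    return mutant_reads / total_reads


# Hypothetical dilution series: (wild type, mutant) in arbitrary equal units.
mixtures = [(99.0, 1.0), (999.0, 1.0)]
expected = [expected_mutant_fraction(wt, mut) for wt, mut in mixtures]
```

Comparing `observed_fraction` with the matching entry of `expected` for each mixture indicates the lowest proportion at which a sequencer still recovers the tracer mutation.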


Summary

Introduction

Influenza viruses exist as a large group of closely related viral genomes, called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. Influenza is an acute and highly contagious viral disease of the respiratory tract in humans. It is caused by influenza A and B viruses and occasionally by influenza C virus. Replication of the RNA genome of influenza viruses is associated with a relatively high mutation rate (2.3 × 10⁻⁵) because the viral RNA-dependent RNA polymerase lacks 3′-5′-exonuclease activity and has no proof-reading function [12,13]. Mutations that are introduced during replication are tolerated because they are neutral for virus fitness in a particular environment, rapidly lost because they reduce fitness, or expanded because they are advantageous [5].
