Abstract

Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5–6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conducted for each group of mixed samples, and the proposed data analysis pipeline was used to identify viral pathogens in these mixed samples. Polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA) was then conducted individually for each of the 42 isolates, depending on the viral types predicted in its group. Two isolates remained unknown after these tests. Iterative mapping was then implemented for each of these 2 isolates and predicted human parechovirus (HPeV) in both. In summary, our NGS pipeline detected the following viruses among the 42 isolates: 29 human rhinoviruses (HRVs), 10 HPeVs, 1 human adenovirus (HAdV), 1 echovirus and 1 rotavirus. We then focused on the 10 identified Taiwanese HPeVs because they are reported to be of greater clinical significance than HRVs. Their genomes were assembled and their genetic diversity was explored. One novel 6-bp deletion was found in one HPeV-1 virus. In terms of nucleotide heterogeneity, 64 genetic variants were detected in these HPeVs using the mapped NGS reads. Most importantly, a recombination event was found between our HPeV-3 and a known HPeV-4 strain in the database. A similar event was detected in the other HPeV-3 strains in the same clade of the phylogenetic tree. These findings demonstrate that the proposed NGS data analysis pipeline identified unknown viruses in mixed clinical samples, revealed their genetic identity and variants, and characterized their genetic features in terms of viral evolution.
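The pooling arithmetic in the abstract (42 isolates in 8 groups of 5–6) can be illustrated with a minimal sketch. The function name `pool_isolates` and the isolate IDs are hypothetical; the abstract does not say how isolates were assigned to groups, so the sequential split below is only an assumption.

```python
def pool_isolates(isolate_ids, n_pools):
    """Split isolates into n_pools groups whose sizes differ by at most one."""
    base, extra = divmod(len(isolate_ids), n_pools)
    pools, start = [], 0
    for i in range(n_pools):
        # The first `extra` pools take one additional isolate.
        size = base + (1 if i < extra else 0)
        pools.append(isolate_ids[start:start + size])
        start += size
    return pools


# Hypothetical isolate identifiers, for illustration only.
isolates = [f"isolate_{i:02d}" for i in range(1, 43)]
pools = pool_isolates(isolates, 8)
pool_sizes = [len(p) for p in pools]
print(pool_sizes)
```

With 42 isolates and 8 pools, an even split gives two pools of 6 and six pools of 5, which matches the "5–6 isolates" per group stated above.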

Highlights

  • Next-generation sequencing (NGS) technology has revolutionized virus discovery [1,2] and metagenomics [3,4] in the past decade

  • The overall methodology included the following steps: 1) NGS was performed on the collected clinical samples using the Illumina system, 2) NGS reads were preprocessed using bioinformatics tools, and sequence homologs of the assembled contigs were searched using BLASTN, 3) NGS diagnoses were validated using polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA), 4) NGS analysis was performed on single isolates if no virus was detected by PCR or ELISA confirmation, 5) iterative mapping was implemented to obtain the viral genome if no genome was provided, 6) reference mapping was performed to identify genetic variants, and 7) genetic diversity was investigated using phylogenetic and recombination analyses

  • Although human rhinoviruses (HRVs) were the viruses most frequently detected in these clinical samples, we focused on the detected human parechoviruses (HPeVs) and explored their genetic characteristics because of their clinical significance
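The confirmation logic in steps 3 and 4 of the methodology (test each isolate for the viruses predicted in its pool, and send unconfirmed isolates on to single-isolate NGS) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: `deconvolve_pools`, the `assay` callback standing in for PCR or ELISA, and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple


def deconvolve_pools(
    pools: Dict[str, List[str]],
    predicted: Dict[str, Set[str]],
    assay: Callable[[str, str], bool],
) -> Tuple[Dict[str, str], List[str]]:
    """Assign each isolate the first pool-predicted virus its assay confirms.

    `assay(isolate, virus)` is a hypothetical stand-in for a PCR or ELISA
    result. Isolates with no confirmed virus are returned as unresolved,
    i.e. candidates for single-isolate NGS.
    """
    confirmed: Dict[str, str] = {}
    unresolved: List[str] = []
    for pool_id, members in pools.items():
        candidates = sorted(predicted.get(pool_id, set()))
        for isolate in members:
            hit = next((v for v in candidates if assay(isolate, v)), None)
            if hit is None:
                unresolved.append(isolate)
            else:
                confirmed[isolate] = hit
    return confirmed, unresolved


# Toy data, for illustration only.
toy_pools = {"pool_1": ["A", "B", "C"]}
toy_predicted = {"pool_1": {"HRV", "HPeV"}}
toy_truth = {"A": "HRV", "B": "HPeV"}  # isolate C matches neither assay

confirmed, unresolved = deconvolve_pools(
    toy_pools, toy_predicted, lambda iso, virus: toy_truth.get(iso) == virus
)
print(confirmed, unresolved)
```

In this toy run, isolates A and B are confirmed by the assay and isolate C is left unresolved, mirroring the 2 isolates in the study that needed further analysis after PCR or ELISA.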

Summary

Introduction

Next-generation sequencing (NGS) technology has revolutionized virus discovery [1,2] and metagenomics [3,4] in the past decade. This technology provides a large number of reads from different pathogens without a priori knowledge about them. These reads are assembled into contigs by computer algorithms [5,6]. De novo assembly of NGS reads often yields only short contigs that align partially to the database sequences or do not match them at all [8], possibly because the putative virus or similar strains have never been sequenced and do not appear in the database, or because the virus has mutated considerably (through the accumulation of point mutations or recombination).
