Abstract

Early characterization of emerging viruses is essential to controlling their spread, as the Zika virus outbreak in 2014 illustrated. Among other non-viral factors, host information is essential for the surveillance and control of virus spread. Flaviviruses (genus Flavivirus), like other viruses, are modulated by high mutation rates and by selective forces that adapt their codon usage to that of their hosts. However, identifying potential hosts for novel viruses remains a major challenge. Potential hosts of emerging zoonotic viruses are usually identified only after several confirmed cases, which is inefficient for deterring future outbreaks. In this paper, we introduce an algorithm to identify the host range of a virus from its raw genome sequences. The proposed strategy compares codon usage frequencies across viruses and hosts by means of a normalized Codon Adaptation Index (CAI). We tested the algorithm on 94 flaviviruses and 16 potential hosts. Compared with empirical observations, the method distinguishes between arthropod and vertebrate hosts for several flaviviruses with high accuracy (virus group 91.9% and host type 86.1%) and specificity (virus group 94.9% and host type 79.6%). Overall, this algorithm may be useful as a complementary tool to current phylogenetic methods for monitoring current and future viral outbreaks by clarifying host–virus relationships.

Highlights

Accepted: 12 May 2021

  • Recent viral pandemics have shown that rapid characterization of the virus is essential during the development of an outbreak [1,2,3,4]

  • Viral genomes are modulated by high mutation rates [7] and by selective forces to adapt their codon usage to that of their hosts, especially when the viruses can infect a wide host range, as is the case for flaviviruses [8]

  • The CAI values between the virus and host (CAIh) are normalized by dividing each by its respective Codon Adaptation Index (CAI), as in Equation (1): nCAI = CAIh / CAI. This yields the normalized CAI value, from which the optimal and likely hosts can be inferred according to how similar the codon usage of a virus is to that of its host organisms.
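
The normalization described in the last highlight can be illustrated with a minimal Python sketch. It assumes the classic definition of the Codon Adaptation Index (the geometric mean of per-codon relative adaptiveness weights derived from a host reference set); the excerpt does not say which CAI implementation the authors used. The function names, the 0.5 pseudocount for unseen codons, and the toy sequences are hypothetical, and the denominator of the normalization is passed in as a plain argument because this excerpt does not spell out how it is obtained.

```python
import math
from collections import Counter

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codon_counts(seq):
    """Count in-frame codons of a coding sequence (frame assumed to start at 0)."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def relative_adaptiveness(reference_counts, pseudocount=0.5):
    """Weight each codon by its count relative to the most used synonymous codon."""
    by_amino = {}
    for codon, aa in CODON_TABLE.items():
        by_amino.setdefault(aa, []).append(codon)
    weights = {}
    for aa, codons in by_amino.items():
        if aa == "*" or len(codons) == 1:
            continue  # stop codons, Met and Trp carry no synonymous choice
        # Assumption: unseen codons get a small pseudocount to avoid log(0).
        counts = {c: reference_counts.get(c, 0) or pseudocount for c in codons}
        top = max(counts.values())
        for c, n in counts.items():
            weights[c] = n / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of all scored codons in seq."""
    log_sum, n = 0.0, 0
    for codon, k in codon_counts(seq).items():
        if codon in weights:
            log_sum += k * math.log(weights[codon])
            n += k
    return math.exp(log_sum / n) if n else float("nan")


def normalized_cai(cai_h, cai_ref):
    """Divide a virus-versus-host CAI by its respective reference CAI value."""
    return cai_h / cai_ref


# Toy usage with made-up sequences (not real genomes).
host_reference = "CTGCTGCTGCTA"
virus_gene = "CTGCTA"
w = relative_adaptiveness(codon_counts(host_reference))
cai_h = cai(virus_gene, w)
```

Repeating the CAI step against each candidate host and ranking the normalized values is one way to read "optimal and likely hosts" in the highlight; the ranking rule itself is not given in this excerpt.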


Summary

Introduction

Recent viral pandemics have shown that rapid characterization of the virus is essential during the development of an outbreak [1,2,3,4]. Emerging viruses are fully characterized only after several confirmed cases occur, which is an inefficient way to deter current and future outbreaks [5]. Viral genomes are modulated by high mutation rates [7] and by selective forces to adapt their codon usage to that of their hosts, especially when the viruses can infect a wide host range, as is the case for flaviviruses [8]. Previous methods identify flavivirus host range from an analysis of dinucleotides [9,10], on the premise that a virus that infects multiple hosts has a weaker dinucleotide bias [11].
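
To make "dinucleotide bias" concrete, the following Python sketch computes the observed-to-expected dinucleotide odds ratio, a statistic commonly used for this purpose. Whether the methods in [9,10] use exactly this statistic is not stated in this excerpt, so treat it as an illustrative assumption; the function name and the input convention (a sequence containing only A, C, G, and T or U) are hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return observed/expected frequency ratios for each dinucleotide in seq.

    A ratio near 1 means the pair occurs about as often as its base
    frequencies predict; values far from 1 indicate a stronger bias.
    Assumes seq contains only A, C, G and T (or U).
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        return {}
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for pair, count in di.items():
        f_xy = count / (n - 1)        # observed dinucleotide frequency
        f_x = mono[pair[0]] / n       # frequency of the first base
        f_y = mono[pair[1]] / n       # frequency of the second base
        ratios[pair] = f_xy / (f_x * f_y)
    return ratios


# Toy usage with a made-up sequence (not a real genome).
ratios = dinucleotide_odds_ratios("ACGTACGGCA")
```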
