Abstract

Many experiments have shown that the functional balance between hemagglutinin (HA) and neuraminidase (NA) plays a crucial role in viral mobility, production, and transmission. However, whether and how HA and NA maintain this balance at the sequence level remains unclear. Here, we applied principal component analysis and hierarchical clustering analysis to thousands of HA and NA sequences of A/H1N1 and A/H3N2. We found significant coevolution between HA and NA at the sequence level, which is closely related to the host species and the epidemic year of the virus. Furthermore, we propose a sequence-to-sequence transformer model (S2STM), which consists mainly of an encoder and a decoder that use a multi-head attention mechanism to establish the mapping between HA and NA sequences. The training results show that the S2STM can effectively “translate” HA into NA and vice versa, thereby building a relationship network between them. Our work combines unsupervised and supervised machine learning methods to identify the sequence matching between HA and NA, which will advance our understanding of the evolution of influenza A viruses (IAVs) and suggests a new approach to sequence analysis.

Highlights

Published: 25 February 2022

  • Against the background of coronavirus disease 2019 [1], the influenza A viruses (IAVs) [2] continue to pose a risk and endanger human health

  • We study the sequence matching between HA and NA in A/H1N1 and A/H3N2 strains based on sequence analysis

  • Using principal component analysis (PCA) [25], we reduced the sequence matrix to a matrix with dimensions (n, k), where k is the number of retained dimensions
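The reduction to an (n, k) matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the one-hot encoding, the helper names `one_hot` and `pca`, the power-iteration solver, and the four toy sequences are all hypothetical and chosen only to keep the example dependency-free.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Flatten an aligned protein sequence into a 0/1 vector (20 slots per position).

    Assumed encoding; residues outside AMINO_ACIDS (gaps, X) become all zeros.
    """
    vec = []
    for residue in seq:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pca(rows, k, iters=200, seed=0):
    """Project an (n, m) matrix onto its top-k principal components, giving (n, k)."""
    n, m = len(rows), len(rows[0])
    # Center every column so the components describe variance around the mean.
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in rows]

    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for _ in range(iters):
            # Deflation: remove the directions that were already extracted.
            for c in components:
                proj = dot(v, c)
                v = [a - proj * b for a, b in zip(v, c)]
            # Power-iteration step: w = X^T (X v), without forming the covariance.
            scores = [dot(row, v) for row in centered]
            w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(m)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:  # no variance left in the remaining directions
                break
            v = [a / norm for a in w]
        components.append(v)

    # Coordinates of each centered row along the k components.
    return [[dot(row, c) for c in components] for row in centered]


# Toy aligned sequences (hypothetical, not real HA or NA data).
sequences = ["ACDE", "ACDE", "ACDF", "GHDF"]
reduced = pca([one_hot(s) for s in sequences], k=2)
for seq, coords in zip(sequences, reduced):
    print(seq, [round(x, 3) for x in coords])
```

Identical sequences land on the same point, and sequences that differ at more positions end up farther apart, which is the property a later clustering step would rely on. For real data sets a library routine would normally replace the hand-written solver.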


Summary

Introduction

Against the background of coronavirus disease 2019 [1], the influenza A viruses (IAVs) [2] continue to endanger human health. Long-term research has shown that two major surface glycoproteins, hemagglutinin (HA) [3] and neuraminidase (NA) [4], are involved in viral infectivity, replication, and transmission [5]. To date, 18 HA and 11 NA subtypes have been identified, and over 120 combinations have been documented in nature [6]. The A/H1N1 and A/H3N2 subtypes circulate in the human population and give rise to seasonal outbreaks [10]. The functional balance between HA and NA is necessary for viral production and interspecies transmission [12]. Under the co-regulation of HA and NA, viral particles must penetrate a gel-like mobile mucus layer to reach and then infect the underlying epithelial cells [13,14].

