Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants have been emerging and circulating globally since the start of the COVID-19 pandemic. Among them, the B.1.617 lineage, first reported in India at the end of 2020, soon became predominant. Tracing genomic variations and understanding their impact on viral properties are the foundations for vaccine and drug development and for deciding which mitigation measures to take or lift. In this study, 1,051 near-complete genomes and 1,559 spike (S) sequences belonging to B.1.617 were analyzed. Single nucleotide polymorphisms (SNPs) were found to be spread across the whole genome. Of the high-frequency mutations identified, 61% (11/18) involved structural proteins, even though two thirds of the viral genome encodes nonstructural proteins. There were 22 positive selection sites, mostly distributed across the S protein, of which 16 resulted from non-C-to-U transitions and deserve special attention. The haplotype network revealed that a large number of daughter haplotypes were continually derived throughout the pandemic, of which H177, H181, H219 and H286, derived from the ancestor haplotype H176 of B.1.617.2, were widely prevalent. Beyond the well-known substitutions L452R and P681R and the deletions of E156 and F157, with their potential biological significance, structural analysis in this study indicated that newer amino acid changes in B.1.617, such as E484Q and N501Y, had reshaped the viral bonding network. In follow-up monitoring, the N501Y mutant, which has potentially enhanced binding ability, was sequenced with increasing frequency in many other countries. Although we cannot fully characterize the properties of all the mutants, including N501Y, their epidemiological spread and biological effects merit close attention.