Abstract
At least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. We recommend calling the 39 codon Spike-overlapping ORF ORF2b; the 41, 57, and 22 codon ORF3a-overlapping ORFs ORF3c, ORF3d, and ORF3b; the 33 codon ORF3d isoform ORF3d-2; and the 97 and 73 codon Nucleocapsid-overlapping ORFs ORF9b and ORF9c. Finally, we document conflicting usage of the name ORF3b in 32 studies, and consequent erroneous inferences, stressing the importance of reserving identical names for homologs. We recommend that authors referring to these ORFs provide lengths and coordinates to minimize ambiguity caused by prior usage of alternative names.
Highlights
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the recently identified strain (F. Wu et al, 2020; Zhou et al, 2020; Zhu et al, 2020) of the species Severe acute respiratory syndrome-related coronavirus in the family Coronaviridae (Gorbalenya et al, 2020) that is the causative agent of coronavirus disease 2019 (COVID-19).
To accommodate the complexity of genome expression in viruses, the 5' end of an open reading frame (ORF) might be moved to a site with a known stop codon readthrough or frameshift signal, as in the case of ORF1b. (Note that, although we require an ORF to end with a stop codon, we do not include the stop codon when we report the lengths and coordinates of the ORF.) We do not require that an ORF exceed some minimum length or that undisputed evidence be available for its translation into a protein.
The conceptual translation of the nucleotide sequence using a codon table determines whether a genome region is an ORF, whereas experimental or computational evidence is needed to determine whether an ORF is translated and encodes a functional protein during virus infection.
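The reporting convention described above (an ORF must end with a stop codon, but the stop codon is excluded from its length and coordinates) can be illustrated with a minimal Python sketch. This is not the authors' method: the function name `find_orfs`, the use of ATG as the only start codon, the standard stop codons, and the restriction to the three forward frames are all assumptions made for illustration.

```python
# Minimal sketch: scan the three forward frames of a nucleotide sequence
# for ORFs and report them with the stop codon excluded.
# Assumptions (not from the source): ATG is the only start codon, the
# standard stop codons apply, and only the forward strand is scanned.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq):
    """Return ORFs as dicts with 1-based inclusive coordinates.

    'start' is the first nucleotide of the start codon, 'end' is the last
    nucleotide before the stop codon, and 'codons' counts codons without
    the stop codon. No minimum length is imposed.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None  # 0-based index of the first start codon seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append({
                    "frame": frame,
                    "start": start + 1,          # convert to 1-based
                    "end": i,                    # last base before the stop codon
                    "codons": (i - start) // 3,  # stop codon not counted
                })
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGG"):
        print(orf)
```

Because the sketch keeps only the first start codon upstream of each stop codon, shorter isoforms that begin at a downstream start codon are not listed separately. Whether any ORF found this way is actually translated still has to be established by experimental or computational evidence.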
Summary
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the recently identified strain (F. Wu et al, 2020; Zhou et al, 2020; Zhu et al, 2020) of the species Severe acute respiratory syndrome-related coronavirus in the family Coronaviridae (subgenus Sarbecovirus, genus Betacoronavirus, subfamily Orthocoronavirinae) (Gorbalenya et al, 2020) that is the causative agent of coronavirus disease 2019 (COVID-19). Various authors have referred to the 97 and 73 codon SARS-CoV-2 ORFs overlapping N as ORF9a and ORF9b, respectively.