The influenza A virus genome consists of 8 RNA segments and was thought to encode 10 viral proteins until investigators identified an 11th, PB1-F2, which is translated from an alternative reading frame of the PB1 gene. The more recently identified PB1-N40, PA-N155 and PA-N182 proteins suggest that a leaky ribosomal scanning mechanism can generate novel open reading frames (ORFs). These ORFs illustrate how the influenza A virus expands its coding capacity through overlapping reading frames. In this study, we performed a computational search, based on a ribosomal scanning mechanism, of all influenza A coding sequences to identify forward-reading ORFs that could be translated into novel viral proteins. To eliminate sporadic ORFs, we required each translated product to have a prevalence of ≥5%. A total of 1,982 ORFs were thus identified and are presented with their locations, lengths and Kozak sequence strengths. We further provide an abridged list of ORFs obtained by requiring every candidate to have an upstream start codon (within the upstream third of the primary transcript), a strong Kozak consensus sequence and high prevalence (≥95% for in-frame ORFs and ≥50% for alternative-frame ORFs). The PB1-F2, PB1-N40, PA-N155 and PA-N182 proteins all fulfilled these filtering criteria. Under these three stringent criteria, we additionally named 16 novel ORFs across all influenza A genome segments except HA and NA, for which 43 HA and 11 NA ORFs from their respective subtypes were also recognized.
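The search described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function names (`kozak_strength`, `find_forward_orfs`, `orf_prevalence`) are hypothetical, the Kozak rule is the common convention (strong = purine at -3 and G at +4, moderate = one of the two, weak = neither) and may differ from the criteria actually used, and prevalence is assumed here to mean the fraction of sequences carrying an ORF at the same start and end coordinates.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kozak_strength(seq, start):
    """Classify the context of the ATG at index `start` (assumed convention)."""
    # -3 is three nucleotides upstream of the A of ATG; +4 directly follows the G.
    minus3 = seq[start - 3] if start >= 3 else ""
    plus4 = seq[start + 3] if start + 3 < len(seq) else ""
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "moderate", "strong")[hits]


def find_forward_orfs(seq, primary_start=0, min_codons=1):
    """Return every ATG-initiated forward ORF that ends in a stop codon.

    `primary_start` is the index of the primary start codon; it is only
    used to label each ORF as in-frame (0) or alternative-frame (1 or 2).
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon to the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # stop codon not counted
                if n_codons >= min_codons:
                    orfs.append({
                        "start": start,
                        "end": pos + 3,  # exclusive, includes the stop codon
                        "codons": n_codons,
                        "frame": (start - primary_start) % 3,
                        "kozak": kozak_strength(seq, start),
                    })
                break
    return orfs


def orf_prevalence(sequences):
    """Fraction of sequences containing an ORF at each (start, end) position."""
    counts = Counter()
    for seq in sequences:
        counts.update({(o["start"], o["end"]) for o in find_forward_orfs(seq)})
    return {key: n / len(sequences) for key, n in counts.items()}
```

With these pieces, the abridged list would correspond to keeping ORFs whose start lies in the upstream third of the transcript, whose `kozak` label is `"strong"`, and whose prevalence meets the ≥95% (frame 0) or ≥50% (frames 1 and 2) threshold. Keying prevalence on coordinates assumes the sequences are aligned to a common numbering.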