The ability of eukaryotic mRNAs to encode several functional polypeptides is widely discussed. Recent progress in NGS and proteomics techniques has produced a large volume of information on potential alternative translation initiation sites and alternative open reading frames (altORFs). However, these data remain incomplete, and the vast majority of eukaryotic mRNAs annotated in conventional databases (e.g., GenBank) contain a single ORF (CDS) encoding a protein larger than an arbitrary threshold (commonly 100 amino acid residues). Thus, some gene functions may relate to polypeptides encoded by unannotated altORFs, and insufficient information in nucleotide sequence databanks may limit the interpretation of genomics and transcriptomics data. Although accurate prediction of altORFs requires dedicated experiments, some simple methods are available for their preliminary mapping.