Abstract

It has long been debated whether the 98% ‘non-coding’ fraction of the human genome can encode functional proteins besides short peptides. With full-length translating mRNA sequencing and ribosome profiling, we found that up to 3330 long non-coding RNAs (lncRNAs) were bound to ribosomes with active translation elongation. With shotgun proteomics, 308 lncRNA-encoded new proteins were detected. A total of 207 unique peptides of these new proteins were verified by multiple reaction monitoring (MRM) and/or parallel reaction monitoring (PRM), and 10 new proteins were verified by immunoblotting. We found that these new proteins differed from canonical proteins in various physical and chemical properties, and emerged mostly in primates during evolution. We further inferred protein functions from assays of translation efficiency, RNA folding and intracellular localization. Because the new protein UBAP1-AST6 localizes to the nucleoli and is preferentially expressed in lung cancer cell lines, we experimentally verified that its function is associated with cell proliferation. In sum, we provide experimental evidence for a hidden human functional proteome encoded by purported lncRNAs, suggesting a resource for annotating new human proteins.

Highlights

  • A fundamental question in biology is how many proteins a human genome can encode

  • We found that 1028 to 3330 long non-coding RNAs (lncRNAs) were bound to ribosomes in the nine tested human cell lines

  • 2969 translating lncRNAs possess at least one canonical open reading frame (ORF) that starts with AUG and can encode a protein of at least 50 aa (Figure 1A and Supplementary Table S1)
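
The ORF criterion in the last highlight (an AUG start codon and an encoded protein of at least 50 aa) can be illustrated with a minimal sketch. This is not the authors' pipeline, which is not described here. The function name `find_canonical_orfs`, the forward-strand-only scan, the standard stop codons, and the choice to report only the most upstream in-frame AUG per stop codon are all assumptions made for illustration.

```python
# Minimal sketch, assuming a plain forward-strand scan of one transcript.
# The name find_canonical_orfs and the 50-aa default are illustrative only.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_canonical_orfs(rna, min_aa=50):
    """Return (start, end, aa_length) tuples for AUG-initiated ORFs.

    start and end are 0-based, end-exclusive, and the span includes the
    stop codon. aa_length counts residues from the AUG up to, but not
    including, the stop codon. ORFs without an in-frame stop are ignored.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    orfs = []
    for frame in range(3):
        start = None  # position of the most upstream in-frame AUG, if any
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None:
                if codon == "AUG":
                    start = i
            elif codon in STOP_CODONS:
                aa_len = (i - start) // 3  # residues, stop codon excluded
                if aa_len >= min_aa:
                    orfs.append((start, i + 3, aa_len))
                start = None  # resume the search after this stop codon
    return sorted(orfs)


if __name__ == "__main__":
    # Toy transcript: 2-nt leader, AUG plus 49 alanine codons, stop, 1-nt tail.
    toy = "GG" + "AUG" + "GCU" * 49 + "UAA" + "C"
    print(find_canonical_orfs(toy))
```

A transcript would pass the highlight's criterion under this sketch if the returned list is non-empty. A real analysis would also need to handle transcript annotation, non-AUG starts, and nested ORFs, which are deliberately left out here.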


Summary

Introduction

A fundamental question in biology is how many proteins a human genome can encode. To date, 19 467 genes are annotated as protein-coding, of which 17 470 have evidence at the protein level [1]. Until 2013, it had been widely believed that long non-coding RNAs (lncRNAs) do not encode proteins [3] but instead regulate translation [4,5]. Whether lncRNAs can encode proteins has been widely debated. Banfai et al. proposed that ribosomes are able to differentiate coding genes from noncoding ones, as most predicted open reading frames (ORFs) have upstream stop codons that lead to early translation termination and very short peptide production [6]. Guttman et al. used computational approaches to propose that lncRNAs lack ribosome release behavior at the stop codon, which distinguishes them from coding RNAs, and concluded that lncRNAs do not encode proteins [3]. Jackson et al. recently found in mice that short and non-ATG-initiated ORFs in non-protein-coding genes could express proteins [10].

