non-AUG Start Codons Research Articles

Introns are found in 5′ untranslated regions (5′UTRs) for 35% of all human transcripts. These 5′UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5′UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5′UTR intron status, we developed a classifier that can predict 5′UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5′ proximal-intron-minus-like-coding regions (“5IM” transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5′ cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5′ proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5′ proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC.


An increasing number of studies involve integrative analysis of gene and protein expression data, taking advantage of new technologies such as next-generation transcriptome sequencing and highly sensitive mass spectrometry (MS) instrumentation. Recently, a strategy termed ribosome profiling (or RIBO-seq) has been described; it is based on deep sequencing of ribosome-protected mRNA fragments and indirectly monitors protein synthesis. We devised a proteogenomic approach constructing a custom protein sequence search space, built from both Swiss-Prot- and RIBO-seq-derived translation products, applicable for MS/MS spectrum identification. To record the impact of using the constructed deep proteome database, we performed two alternative MS-based proteomic strategies: (i) a regular shotgun proteomic approach and (ii) an N-terminal combined fractional diagonal chromatography (COFRADIC) approach. Although the former technique gives an overall assessment at the protein and peptide level, the latter technique, which specifically enables the isolation of N-terminal peptides, is well suited to validating the RIBO-seq-derived (alternative) translation initiation site profile. We demonstrate that this proteogenomic approach increases the overall protein identification rate by 2.5% (e.g. new protein products, new protein splice variants, single nucleotide polymorphism variant proteins, and N-terminally extended forms of known proteins) as compared with only searching UniProtKB-SwissProt. Furthermore, using this custom database, identification of N-terminal COFRADIC data resulted in detection of 16 alternative start sites giving rise to N-terminally extended protein variants, in addition to the identification of four translated upstream ORFs. Notably, the characterization of these new translation products revealed the use of multiple near-cognate (non-AUG) start codons.
As deep sequencing techniques are becoming more standard, less expensive, and widespread, we anticipate that mRNA sequencing and especially custom-tailored RIBO-seq will become indispensable in the MS-based protein or peptide identification process. The underlying mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000124.
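
The idea of N-terminally extended proteoforms that start at near-cognate codons can be illustrated with a small sketch. It is not the authors' pipeline: the function names `near_cognate_codons` and `n_terminal_extensions` are hypothetical, "near-cognate" is assumed to mean a codon differing from AUG at exactly one position, and the toy mRNA is made up for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def near_cognate_codons(start="AUG"):
    """Return every codon that differs from `start` at exactly one position."""
    codons = set()
    for pos in range(3):
        for base in "ACGU":
            if base != start[pos]:
                codons.add(start[:pos] + base + start[pos + 1:])
    return codons


def n_terminal_extensions(mrna, aug_index):
    """Scan upstream of the annotated AUG, staying in frame.

    Returns a list of (position, codon) for in-frame near-cognate codons
    that can be reached without crossing an in-frame stop codon, ordered
    from closest to farthest from the annotated start.
    """
    near = near_cognate_codons()
    hits = []
    pos = aug_index - 3
    while pos >= 0:
        codon = mrna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an upstream stop ends any possible extension
        if codon in near:
            hits.append((pos, codon))
        pos -= 3
    return hits


# Toy transcript (hypothetical): GG | UAA CUG GCC ACG | AUG GCU UAA
toy_mrna = "GGUAACUGGCCACGAUGGCUUAA"
candidates = n_terminal_extensions(toy_mrna, aug_index=14)
print(candidates)
```

In a proteogenomic setting, each candidate start would be translated through to the annotated stop codon and the resulting extended sequence added to the search database alongside the canonical entry; here only the candidate positions are reported.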



Articles published on non-AUG Start Codons

Ribosomal Proteins Regulate MHC Class I Peptide Generation for Immunosurveillance

Novel pipeline identifies new upstream ORFs and non-AUG initiating main ORFs with conserved amino acid sequences in the 5' leader of mRNAs in Arabidopsis thaliana.

Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data.

Complete motif analysis of sequence requirements for translation initiation at non-AUG start codons.

Non-AUG translation: a new start for protein synthesis in eukaryotes

Non-AUG start codons responsible for ABO weak blood group alleles on initiation mutant backgrounds

A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N1-methyladenosine modification.

Divergent hepatitis E virus in birds of prey, common kestrel (Falco tinnunculus) and red-footed falcon (F. vespertinus), Hungary

Ribosome Structure Reveals Preservation of Active Sites in the Presence of a P-Site Wobble Mismatch

Translational Initiation at a Non-AUG Start Codon for Human and Mouse Negative Elongation Factor-B.

Non-Canonical Start Codons Reinitiate Translation in N-Terminal Truncated Kv Channels

Discovery of Human sORF-Encoded Polypeptides (SEPs) in Cell Lines and Tissue

Deep Proteome Coverage Based on Ribosome Profiling Aids Mass Spectrometry-based Protein and Peptide Discovery and Provides Evidence of Alternative Translation Products and Near-cognate Translation Initiation Events

Tuning gene expression with synthetic upstream open reading frames

Fluorescent Proteins and in Vitro Genetic Organization for Cell-Free Synthetic Biology

The Stringency of Start Codon Selection in the Filamentous Fungus Neurospora crassa

Peptidomic discovery of short open reading frame–encoded peptides in human cells

Hidden coding potential of eukaryotic genomes: nonAUG started ORFs

Molecular characterization of Banana streak virus isolate from Musa Acuminata in China

Functional Elements in Initiation Factors 1, 1A, and 2β Discriminate against Poor AUG Context and Non-AUG Start Codons

