Abstract

Upstream open reading frames (uORFs) are open reading frames that occur within the 5' UTR of an mRNA. uORFs have been found in many organisms. They play an important role in gene regulation, cell development, and various metabolic processes. It is believed that translated uORFs reduce the translational efficiency of the main coding region. However, only a few uORFs have been experimentally characterized. In this paper, we use ribosome footprinting together with a semi-supervised approach based on stacking classification models to identify translated uORFs in Arabidopsis thaliana. Our approach identified 5360 potentially translated uORFs in 2051 genes. GO terms enriched in genes with translated uORFs include catalytic activity, binding, transferase activity, phosphotransferase activity, kinase activity, and transcription regulator activity. The reported uORFs occur with a higher frequency in multi-isoform genes, and some uORFs are affected by alternative transcript start sites or alternative splicing events. Association rule mining revealed sequence features associated with the translation status of the uORFs. We hypothesize that uORF translation is a complex process that might be regulated by multiple factors. The identified uORFs are available online at: https://www.dropbox.com/sh/zdutupedxafhly8/AABFsdNR5zDfiozB7B4igFcja?dl=0. This paper is the extended version of our research presented at ISBRA 2015.
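
To make the definition of a uORF concrete, the following minimal Python sketch enumerates candidate uORFs in a 5' UTR sequence. It is an illustration only and not the pipeline used in this paper, which relies on ribosome footprinting and stacked classifiers to decide which candidates are translated. The function name `find_uorfs` is hypothetical, and the sketch assumes that a uORF starts at an ATG and ends at the first in-frame stop codon lying entirely within the 5' UTR; uORFs with non-canonical start codons or those overlapping the main coding region are not considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(utr5):
    """Return (start, end) pairs (0-based, end exclusive) for candidate uORFs.

    Assumption for illustration: a candidate is an ATG followed by an
    in-frame stop codon that lies fully inside the 5' UTR.
    """
    # Accept RNA or DNA input by normalising to an upper-case DNA alphabet.
    seq = utr5.upper().replace("U", "T")
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs

if __name__ == "__main__":
    # Toy sequence, not real Arabidopsis data.
    print(find_uorfs("GGATGAAATAGCC"))
```

Candidates found this way are purely sequence-based; whether a given uORF is actually translated is the question addressed by the ribosome footprinting data and classification models described above.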
