Abstract

Background: All standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. The aim of this work was to estimate mRNA open reading frame (ORF) 5' region sequence completeness in the model organism Danio rerio (zebrafish).

Results: We implemented a novel automated approach (5'_ORF_Extender) that systematically compares available expressed sequence tags (ESTs) with all the zebrafish experimentally determined mRNA sequences, identifies additional sequence stretches at 5' region and scans for the presence of all conditions needed to define a new, extended putative ORF. Our software was able to identify 285 (3.3%) mRNAs with putatively incomplete ORFs at 5' region and, in three example cases selected (selt1a, unc119.2, nppa), the extended coding region at 5' end was cloned by reverse transcription-polymerase chain reaction (RT-PCR).

Conclusion: The implemented method, which could also be useful for the analysis of other genomes, allowed us to describe the relevance of the "5' end mRNA artifact" problem for genomic annotation and functional genomic experiment design in zebrafish.

Open peer review: This article was reviewed by Alexey V. Kochetov (nominated by Mikhail Gelfand), Shamil Sunyaev, and Gáspár Jékely. For the full reviews, please go to the Reviewers' Comments section.

Highlights

  • All standard methods for DNA complementary to RNA (cDNA) cloning are affected by a potential inability to effectively clone the 5' region of messenger RNA (mRNA)

  • Database construction and computational analysis: the high-throughput Basic Local Alignment Search Tool (BLAST) analysis generated 1,189,412 BLAST hit lines for the 8,528 investigated zebrafish mRNAs compared with the Danio rerio expressed sequence tag (EST) database

  • Following calculations executed by the 5'_ORF_Extender software, it was possible to obtain candidate extended coding regions at 5' end from 1,346 BLAST hits, using the criteria described in the "Methods" section

Summary

Introduction

All standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. The aim of this work was to estimate mRNA open reading frame (ORF) 5' region sequence completeness in the model organism Danio rerio (zebrafish). The amino acid sequence of a gene product is routinely deduced from the nucleotide sequence of the corresponding cloned cDNA, according to the rules for start codon recognition (first-AUG rule, optimal sequence context) and the genetic code [1,2]. The identification of a more complete mRNA 5' end could reveal an additional upstream AUG, in-frame with the previously determined one and in the optimal context, extending the predicted amino-terminal sequence of the product. A putative translation start based on an incomplete mRNA sequence may therefore lead to incorrect prediction of the product amino acid sequence, and to subsequent errors in the experimental cloning and functional assay of the corresponding cDNA. Although methods to determine the complete mRNA ORF have been developed, such as 5' cap trapping [6] and cap analysis of gene expression (CAGE) [7], they are experimentally intensive and have not been applied to zebrafish mRNA on a large scale.
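
The core idea of an upstream, in-frame AUG can be illustrated with a minimal Python sketch. It is not the 5'_ORF_Extender implementation: the function name `find_upstream_start` and its parameters are hypothetical, the input is assumed to be an mRNA sequence already extended at its 5' end (for example, by an overlapping EST), and the sketch checks only the reading frame and the absence of an intervening in-frame stop codon. It does not test the sequence context around the AUG or any of the other conditions the software applies.

```python
from typing import Optional

# Stop codons of the standard genetic code, in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_upstream_start(seq: str, annotated_start: int) -> Optional[int]:
    """Return the 0-based index of the most upstream ATG that is in-frame
    with the annotated start codon and not separated from it by an
    in-frame stop codon, or None if no such ATG exists.

    Assumption: `seq` is the 5'-extended transcript and `annotated_start`
    is the index of the previously annotated start codon within it.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    if seq[annotated_start:annotated_start + 3] != "ATG":
        raise ValueError("annotated_start does not point at an ATG codon")

    best = None
    pos = annotated_start - 3  # first codon upstream, same reading frame
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop closes the frame; nothing further counts
        if codon == "ATG":
            best = pos  # keep walking: a more upstream ATG may still exist
        pos -= 3
    return best


# Toy sequences only (not zebrafish data); the annotated ATG is at index 11.
extended = "GGATGGCTGCAATGAAATAA"  # in-frame ATG at index 2, no stop between
blocked = "GGATGTAAGCAATGAAATAA"   # in-frame TAA lies between the two ATGs
print(find_upstream_start(extended, 11))
print(find_upstream_start(blocked, 11))
```

Walking upstream one codon at a time keeps the scan in the annotated reading frame by construction, so an ATG in another frame is never considered, and stopping at the first in-frame stop codon ensures that any reported ATG opens a continuous reading frame into the annotated coding region.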
