Abstract
Background: Alternative polyadenylation sites within a gene can lead to alternative transcript variants. Although bioinformatic analyses have been conducted to detect polyadenylation sites using nucleic acid sequences (EST/mRNA) in the public databases, one special type, the single-block EST, has received much less attention. This bias leaves considerable room to discover novel transcript variants.

Results: In the present study, we identified novel transcript variants in the human genome by detecting intronic polyadenylation sites. Poly(A/T)-tailed ESTs were obtained from single-block ESTs and clustered into 10,844 groups representing 5,670 genes. Most sites were not found in other alternative splicing databases. To verify that these sites are from expressed transcripts, we analyzed the number of ESTs supporting each site, ran BLAST searches of representative ESTs against known mRNA sequences, traced terminal sequences from cDNA clones, and compared the sites with Affymetrix tiling array data. These analyses confirmed about 84% (9,118/10,844) of the novel alternative transcripts; in particular, 33% (3,575/10,844) of the transcripts, from 2,704 genes, were classified as high-reliability. Additionally, RT-PCR confirmed 38% (10/26) of the predicted novel transcript variants.

Conclusion: Our results provide evidence for novel transcript variants with intronic poly(A) sites. The expression of these novel variants was confirmed with computational and experimental tools. Our data provide a genome-wide resource for identifying novel human transcript variants with intronic polyadenylation sites, and offer a new view into the complexity of the human transcriptome.
Highlights
Alternative polyadenylation sites within a gene can lead to alternative transcript variants
Mapping and clustering intronic poly(A) sites in the human genome: to identify novel transcript variants resulting from previously unidentified intronic poly(A) sites, an annotated expressed sequence tag (EST) alignment file from the UCSC Genome Browser (http://genome.ucsc.edu) was analyzed (Figure 1)
We focused on single-block ESTs that did not overlap known mRNA sequences
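The mapping-and-clustering step described in the highlights above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The `PolyASite` record, the `has_poly_a_tail` and `cluster_sites` helpers, the minimum tail length of 8, and the 24-nt clustering window are all hypothetical choices made for illustration, since the text here does not state the thresholds actually used.

```python
from collections import defaultdict
from typing import NamedTuple


class PolyASite(NamedTuple):
    """Hypothetical record for one EST-supported poly(A) site."""
    chrom: str
    strand: str
    position: int  # genomic coordinate of the putative cleavage site
    est_id: str


def has_poly_a_tail(seq: str, min_len: int = 8) -> bool:
    """Return True if the read ends in a run of A's or starts with a run of T's.

    min_len is an assumed threshold, not a value taken from the paper.
    """
    s = seq.upper()
    return s.endswith("A" * min_len) or s.startswith("T" * min_len)


def cluster_sites(sites, max_gap: int = 24):
    """Group sites on the same chromosome and strand into clusters.

    Neighbouring sites (sorted by position) fall into the same cluster when
    they are at most max_gap nucleotides apart. max_gap is an assumption.
    """
    by_key = defaultdict(list)
    for site in sites:
        by_key[(site.chrom, site.strand)].append(site)

    clusters = []
    for key in sorted(by_key):
        ordered = sorted(by_key[key], key=lambda s: s.position)
        current = [ordered[0]]
        for site in ordered[1:]:
            if site.position - current[-1].position <= max_gap:
                current.append(site)
            else:
                clusters.append(current)
                current = [site]
        clusters.append(current)
    return clusters


if __name__ == "__main__":
    demo = [
        PolyASite("chr1", "+", 100, "est_a"),
        PolyASite("chr1", "+", 110, "est_b"),
        PolyASite("chr1", "+", 200, "est_c"),
        PolyASite("chr1", "-", 105, "est_d"),
    ]
    for cluster in cluster_sites(demo):
        print(cluster[0].chrom, cluster[0].strand,
              [s.position for s in cluster], len(cluster))
```

The number of ESTs in each cluster corresponds to the "supporting EST number" that the abstract uses as one line of evidence for a site. A real analysis would also need to filter out internal priming at genomic A-rich stretches, which this sketch does not attempt.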
Summary
Alternative polyadenylation sites within a gene can lead to alternative transcript variants. Although bioinformatic analyses have been conducted to detect polyadenylation sites using nucleic acid sequences (EST/mRNA) in the public databases, one special type, the single-block EST, has received much less attention. This bias leaves considerable room to discover novel transcript variants. Recent studies of human tissue transcriptomes by high-throughput sequencing have revealed that about 95% of multiexon genes undergo alternative splicing (AS) [1,2]. This substantially raises previous estimates of human AS events [3,4,5] and adds further complexity to transcripts and proteins. These alternative transcripts are often expressed in a tissue-specific pattern, and contribute to some inherited disorders and tumor development [12,13,14,15,16].