Abstract

Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb, including 218 bases of 5′ UTR and 321 bases of 3′ UTR, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs encode putative transcription factors, suggesting that rare transcripts are well enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, using semi-strict parameters, to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).
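
As a rough reading aid, the percentages in the abstract can be converted into approximate FLcDNA counts, and the mean coding length can be estimated from the reported mean transcript and UTR lengths. The sketch below is illustrative only: the variable and function names are hypothetical, the counts are back-calculated from rounded percentages and so are approximate, and the coding-length estimate assumes all three means were computed over the same set of FLcDNAs.

```python
# Figures quoted in the abstract; everything derived below is an approximation.
TOTAL_FLCDNAS = 27_455
MEAN_TRANSCRIPT_BASES = 1_440  # 1.44 kb
MEAN_5P_UTR_BASES = 218
MEAN_3P_UTR_BASES = 321

# Percentages as reported (rounded in the source).
REPORTED_PERCENT = {
    "mapped_to_genome": 94.0,
    "contain_te_sequence": 5.6,
    "putative_transcription_factors": 7.2,
    "predicted_protein_under_100_aa": 8.6,
}


def approx_count(percent, total=TOTAL_FLCDNAS):
    """Convert a reported percentage into an approximate number of FLcDNAs."""
    return round(total * percent / 100)


# Assumption: mean coding length = mean transcript length minus mean UTR lengths.
mean_coding_bases = MEAN_TRANSCRIPT_BASES - MEAN_5P_UTR_BASES - MEAN_3P_UTR_BASES
mean_codons = mean_coding_bases // 3  # includes the stop codon

approx_counts = {name: approx_count(p) for name, p in REPORTED_PERCENT.items()}

if __name__ == "__main__":
    print(f"mean coding region: ~{mean_coding_bases} bases (~{mean_codons} codons)")
    for name, count in approx_counts.items():
        print(f"{name}: ~{count} FLcDNAs")
```

Under these assumptions the average coding region is roughly 900 bases, or about 300 codons, which puts the 8.6% of FLcDNAs encoding proteins of fewer than 100 amino acids well below the mean.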

Highlights

  • Full-length cDNAs (FLcDNAs) help determine the exon/intron structure of genes when aligned to the sequenced genome; 94% of our FLcDNAs aligned to the maize genome

  • The 27,455 FLcDNAs were compared to gene sequences for rice, sorghum, Arabidopsis, and poplar; 22,874 were found in all four sets, and 1,737 were unique to maize

Introduction

Zea mays L. is one of the world’s most important crop plants as well as a model organism for analysis of the impact of transposable elements on genome structure and gene expression [1] and the premier example of allelic diversity within an organism [2]. The success of approximately 100 years of maize genetic analysis is based on the functional diploidy of many loci, as the loss of one gene function is sufficient to generate a scoreable phenotype. Local gene duplications occur surprisingly often in flowering plants compared to animal genomes; for example, 12.6–16.6% of loci in Arabidopsis thaliana are estimated to be adjacent to a gene family member [4], and this estimate increases to one-third of the genes in maize [5]. A subset of these local duplications is either very recent or has been corrected by recombination; the frequency of nearly identical neighboring genes is estimated to exceed 1% in maize [6]. The entire genome was duplicated approximately 5–12 million years ago (MYA) [7].
