Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery

Thomas L Parchman, Katherine S Geist, Craig W Benkman, Johan A Grahnen, C Alex Buerkle

doi:10.1186/1471-2164-11-180


Open Access


Journal: BMC Genomics	Publication Date: Jan 1, 2010
Citations: 459	License type: cc-by

Affiliation: University of Wyoming, Beloit College

Abstract

Background: Massively parallel sequencing of cDNA is now an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole genome sequencing efforts are currently cost- and time-prohibitive. The large and complex genomes of pines (Pinus spp.) have hindered the development of genomic resources, despite the ecological and economic importance of the group. While most genomic studies have focused on a single species (P. taeda), genomic-level resources for other pines are insufficiently developed to facilitate ecological genomic research. Lodgepole pine (P. contorta) is an ecologically important foundation species of montane forest ecosystems and exhibits substantial adaptive variation across its range in western North America. Here we describe a sequencing study of expressed genes from P. contorta, including their assembly and annotation, and their potential for molecular marker development to support population and association genetic studies.

Results: We obtained 586,732 sequencing reads from a 454 GS XLR70 Titanium pyrosequencer (mean length: 306 base pairs). A combination of reference-based and de novo assemblies yielded 63,657 contigs, with 239,793 reads remaining as singletons. Based on sequence similarity with known proteins, these sequences represent approximately 17,000 unique genes, many of which are well covered by contig sequences. This sequence collection also included a surprisingly large number of retrotransposon sequences, suggesting that they are highly transcriptionally active in the tissues we sampled. We located and characterized thousands of simple sequence repeats (SSRs) and single nucleotide polymorphisms as potential molecular markers in our assembled and annotated sequences. High-quality PCR primers were designed for a substantial number of the SSR loci, and a large number of these were amplified successfully in initial screening.

Conclusions: This sequence collection represents a major genomic resource for P. contorta, and the large number of genetic markers characterized should contribute to future research in this and other pines. Our results illustrate the utility of next-generation sequencing as a basis for marker development and population genomics in non-model species.
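To make the marker-discovery step concrete, here is a minimal sketch of how perfect simple sequence repeats could be located in assembled sequences. The abstract does not say which tool or thresholds the authors used, so everything below is an assumption made for illustration: the function name `find_ssrs`, the motif length range (2 to 6 bp), and the minimum of five tandem copies are all hypothetical.

```python
import re

# Assumed threshold (not from the paper): a motif must occur at least
# this many times in tandem to be reported.
MIN_REPEATS = 5

# A 2-6 bp motif followed by at least MIN_REPEATS - 1 further copies of itself.
# The lazy quantifier tries the shortest motif first, so ATATATATAT is
# reported as (AT)5 rather than as a longer compound motif.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_ssrs(seq):
    """Return (start, motif, copies) for each perfect tandem repeat in seq."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAAAAAA, which match as (AA)n.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    contig = "GGC" + "AT" * 6 + "CCG"
    for start, motif, copies in find_ssrs(contig):
        print(f"SSR at {start}: ({motif}){copies}")
```

A regular expression keeps the sketch short but finds only perfect, uninterrupted repeats. Primer design around each hit would be a separate step and is not shown.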

Highlights

Parallel sequencing of cDNA is an efficient route for generating enormous sequence collections that represent expressed genes.
454 sequencing and assembly: We created a normalized cDNA pool based on RNA extracted from needles and developing conelets that were sampled from four individual P. contorta trees in the Medicine Bow National Forest in Wyoming.
A large number of sequences (10.4%) were most similar to fungal proteins (Table 3), likely indicating the presence of endophytic fungi in our sampled tissues. Although this seemingly low percentage of Expressed Sequence Tags (ESTs) with BLAST hits is partially due to a high frequency of short sequences in our ESTs, annotation of only 30-40% of sequences is common in analyses of large EST collections [5,16,38].

Summary

Introduction

Parallel sequencing of cDNA is an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole genome sequencing efforts are currently cost- and time-prohibitive. Transcriptome, or Expressed Sequence Tag (EST), sequencing is an efficient means to generate functional genomic-level data for non-model organisms or those with genome characteristics prohibitive to whole genome sequencing. Large collections of EST sequences have proven invaluable for gene annotation and discovery [2,4], comparative genomics [5], development of molecular markers [6,7], and for population genomic studies of genetic variation associated with adaptive traits [8]. Until recently, traditional laboratory methods for the development of EST resources have required costly and time-consuming approaches involving cloning, cDNA library construction, and many labor-intensive Sanger sequencing runs [2].

