Abstract
Background
Most evolutionary developmental biology ("evo-devo") studies of emerging model organisms focus on small numbers of candidate genes cloned individually using degenerate PCR. However, newly available sequencing technologies such as 454 pyrosequencing have recently begun to allow for massive gene discovery in animals without sequenced genomes. Within insects, although large volumes of sequence data are available for holometabolous insects, developmental studies of basally branching hemimetabolous insects typically suffer from low rates of gene discovery.
Results
We used 454 pyrosequencing to sequence over 500 million bases of cDNA from the ovaries and embryos of the milkweed bug Oncopeltus fasciatus, which lacks a sequenced genome. This indirectly developing insect occupies an important phylogenetic position, branching basal to Diptera (including fruit flies) and Hymenoptera (including honeybees), and is an experimentally tractable model for short-germ development. A total of 2,087,410 reads from both normalized and non-normalized cDNA assembled into 21,097 sequences (isotigs) and 112,531 singletons. The assembled sequences fell into 16,617 unique gene models, and included predictions of splicing isoforms, which we examined experimentally. Discovery of new genes plateaued after assembly of ~1.5 million reads, suggesting that we have sequenced nearly all transcripts present in the cDNA sampled. Many transcripts have been assembled at close to full length, and there is a net gain of sequence data for over half of the pre-existing O. fasciatus accessions for developmental genes in GenBank. We identified 10,775 unique genes, including members of all major conserved metazoan signaling pathways and genes involved in several major categories of early developmental processes. We also specifically address the effects of cDNA normalization on gene discovery in de novo transcriptome analyses.
Conclusions
Our sequencing, assembly and annotation framework provides a simple and effective way to achieve high-throughput gene discovery for organisms lacking a sequenced genome. These data will have applications to the study of the evolution of arthropod genes and genetic pathways, and to the wider evolution, development and genomics communities working with emerging model organisms.
[The sequence data from this study have been submitted to GenBank under study accession number SRP002610 (http://www.ncbi.nlm.nih.gov/sra?term=SRP002610). Custom scripts generated are available at http://www.extavourlab.com/protocols/index.html. Seven Additional files are available.]
Highlights
Most evolutionary developmental biology ("evo-devo") studies of emerging model organisms focus on small numbers of candidate genes cloned individually using degenerate PCR.
Many researchers have highlighted the need for developing new model organisms for specific comparative, evolutionary and ecological questions [6,7,8]. It has been suggested that the single gene expression approach of the last several decades of evolutionary developmental biology ("evo-devo") has outlived its usefulness, and that what are needed are not more model organisms, but rather a smaller number of groups chosen for the ability to functionally manipulate genes [9,10].
Although genomic sequence data will be necessary in the future for linkage or cis-regulatory analyses, at the early stages of establishing new model organisms one of the most important goals is often gene discovery.
Summary
Most evolutionary developmental biology ("evo-devo") studies of emerging model organisms focus on small numbers of candidate genes cloned individually using degenerate PCR. Studying a huge diversity of animals has long been the norm in the classical fields of experimental embryology, and many researchers have highlighted the need for developing new model organisms for specific comparative, evolutionary and ecological questions [6,7,8]. It has been suggested that the single gene expression approach of the last several decades of evo-devo has outlived its usefulness, and that what are needed are not more model organisms, but rather a smaller number of groups chosen for the ability to functionally manipulate genes [9,10]. Most studies of non-traditional organisms are still subject to a gene discovery bottleneck. This is largely because, at the scale needed to uncover rare developmental transcripts, Sanger-based EST sequencing quickly becomes technically and financially prohibitive for many labs working on organisms with smaller research communities. Those smaller-scale EST projects that have been carried out are often not publicly available in searchable formats, and their potential contribution to the developmental and evolutionary biology fields is limited.