Abstract

Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought-tolerant cowpea line. However, current methods were inefficient for screening libraries and managing the sequence data, and thus there was a need to develop software tools to facilitate the process.

Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value < 0.05). Enrichment ratio 2 calculations showed that > 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones was chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound-induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and illustrated that the SSHscreen plots are a useful tool for choosing anonymous clones for sequencing, since redundant clones cluster together on the enrichment ratio plots.

Conclusions: We developed the SSHscreen-SSHdb software pipeline, which greatly facilitates gene discovery using suppression subtractive hybridization by improving the selection of clones for sequencing after screening the library on a small number of microarrays. Annotation of the sequence information and collaboration were further enhanced through the web-based SSHdb database, and we illustrated this through identification of drought-responsive genes from cowpea, which can now be investigated in gene function studies. SSH is a popular and powerful gene discovery tool, and therefore this pipeline will have application for gene discovery in any biological system, particularly non-model organisms. SSHscreen 2.0.1 and a link to SSHdb are available from http://microarray.up.ac.za/SSHscreen.
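
The idea behind spike-in control spot normalization can be sketched in a few lines. The Python example below is illustrative only and is not the SSHscreen implementation (SSHscreen is an R package whose internals are not described here). It assumes that spike-in controls are added in equal amounts to both samples, so their log2 ratio should be zero, and it centres all spots on the median of the control spots. The function names and the intensity values are hypothetical.

```python
from math import log2
from statistics import median


def log2_ratio(tester, driver):
    """Log2 ratio of two spot intensities (hypothetical helper)."""
    return log2(tester / driver)


def spike_in_normalize(log_ratios, is_spike_in):
    """Shift log2 ratios so the spike-in control spots have a median of zero.

    Assumption: spike-in controls are present in equal amounts in both
    samples, so any non-zero median among them reflects technical bias.
    """
    controls = [r for r, flag in zip(log_ratios, is_spike_in) if flag]
    if not controls:
        raise ValueError("at least one spike-in control spot is required")
    offset = median(controls)
    return [r - offset for r in log_ratios]


# Made-up intensities for four spots; the first two are spike-in controls.
tester = [800.0, 400.0, 3200.0, 100.0]
driver = [400.0, 200.0, 200.0, 200.0]
is_spike_in = [True, True, False, False]

raw = [log2_ratio(t, d) for t, d in zip(tester, driver)]
normalized = spike_in_normalize(raw, is_spike_in)
```

In this toy data set the controls show a twofold bias towards the tester channel, and the normalization removes it from every spot before any enrichment ratio would be interpreted.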

Highlights

  • Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp)

  • Expressed sequence tag (EST) sequencing of cloned cDNAs is a common approach with the advantage that if full-length cDNAs are cloned they can be directly employed for further gene function experiments [1]

  • Construction of the cowpea drought expression suppression subtractive hybridization (SSH) library and overview of the SSHscreen/SSHdb data analysis pipeline: we developed a pipeline for quantitative screening and sequence management of clones from an SSH cDNA library


Summary

Introduction

Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). Cloned cDNAs can be arrayed on high-density microarrays and used for expression profiling [2]. Next-generation sequencing, such as 454 technology™, has been employed for sequencing cDNA libraries [3], and this approach has been dubbed RNA-Seq when applied at deep enough coverage to compare transcript counts between two or more biological states [4]. Previous methods, such as serial analysis of gene expression (SAGE), are based on counting short sequence tags [5]. They are most effective if an annotated genome sequence is available.
