TOBFAC: the database of tobacco transcription factors

Paul J Rushton, Jennifer F Brannock, Marta T Bokowiec, Xianfeng Chen, Thomas W Laudeman, Michael P Timko

doi:10.1186/1471-2105-9-53

Open Access

Journal: BMC Bioinformatics
Publication Date: Jan 25, 2008
Citations: 92
License type: cc-by

Affiliation: University of Virginia

Abstract

Background

Regulation of gene expression at the level of transcription is a major control point in many biological processes. Transcription factors (TFs) can activate and/or repress the transcriptional rate of target genes, and vascular plant genomes devote approximately 7% of their coding capacity to TFs. Global analysis of TFs has only been performed for three complete higher plant genomes: Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa) and rice (Oryza sativa). Presently, no large-scale analysis of TFs has been made from a member of the Solanaceae, one of the most important families of vascular plants. To fill this void, we have analysed tobacco (Nicotiana tabacum) TFs using a dataset of 1,159,022 gene-space sequence reads (GSRs) obtained by methylation filtering of the tobacco genome. An analytical pipeline was developed to isolate TF sequences from the GSR data set. This involved multiple (typically 10–15) independent searches with different versions of the TF family-defining domain(s) (normally the DNA-binding domain), followed by assembly into contigs and verification. Our analysis revealed that tobacco contains a minimum of 2,513 TFs representing all of the 64 well-characterised plant TF families. The number of TFs in tobacco is higher than previously reported for Arabidopsis and rice.

Results

TOBFAC: the database of tobacco transcription factors, is an integrative database that provides a portal to sequence and phylogeny data for the identified TFs, together with a large quantity of other data concerning TFs in tobacco. The database contains an individual page dedicated to each of the 64 TF families. Each page contains background information, domain architecture via Pfam links, a list of all sequences and an assessment of the minimum number of TFs in that family in tobacco. Downloadable phylogenetic trees of the major families are provided, along with detailed information on the bioinformatic pipeline that was used to find all family members. TOBFAC also contains EST data, a list of published tobacco TFs and a list of papers concerning tobacco TFs. The sequences and annotation data are stored in relational tables using a PostgreSQL relational database management system. The data processing and analysis pipelines used the Perl programming language. The web interface was implemented in JavaScript and Perl CGI running on an Apache web server. The computationally intensive data processing and analysis pipelines were run on an Apple XServe cluster with more than 20 nodes.

Conclusion

TOBFAC is an expandable knowledgebase of tobacco TFs with data currently available for over 2,513 TFs from 64 gene families. TOBFAC integrates available sequence information, phylogenetic analysis, and EST data with published reports on tobacco TF function. The database provides a major resource for the study of gene expression in tobacco and the Solanaceae and helps to fill a current gap in studies of TF families across the plant kingdom. TOBFAC is publicly accessible.
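
The abstract states that sequences and annotation data are held in relational tables. The following is only an illustrative sketch of what such a layout could look like, not the actual TOBFAC schema: the table names, column names, and helper functions are all hypothetical, and Python's built-in sqlite3 module stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables:
    one row per TF family, and one row per sequence assigned to a family."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tf_family (
            family_id      INTEGER PRIMARY KEY,
            name           TEXT NOT NULL UNIQUE,
            pfam_accession TEXT              -- link target for domain architecture
        );
        CREATE TABLE tf_sequence (
            sequence_id INTEGER PRIMARY KEY,
            family_id   INTEGER NOT NULL REFERENCES tf_family(family_id),
            label       TEXT NOT NULL,
            residues    TEXT NOT NULL
        );
        """
    )
    return conn


def add_family(conn, name, pfam_accession=None):
    """Insert a family row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO tf_family (name, pfam_accession) VALUES (?, ?)",
        (name, pfam_accession),
    )
    return cur.lastrowid


def add_sequence(conn, family_id, label, residues):
    """Insert one sequence belonging to the given family."""
    conn.execute(
        "INSERT INTO tf_sequence (family_id, label, residues) VALUES (?, ?, ?)",
        (family_id, label, residues),
    )


def count_by_family(conn):
    """Return {family name: number of stored sequences}, including empty families."""
    rows = conn.execute(
        """
        SELECT f.name, COUNT(s.sequence_id)
        FROM tf_family AS f
        LEFT JOIN tf_sequence AS s ON s.family_id = f.family_id
        GROUP BY f.name
        ORDER BY f.name
        """
    ).fetchall()
    return dict(rows)
```

A per-family count like the one returned by `count_by_family` is the kind of query that could back a family page listing its member sequences; the placeholder family names and sequences used to exercise it would be replaced by real records.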

Highlights

Regulation of gene expression at the level of transcription is a major control point in many biological processes
Our aim was to isolate every gene in all transcription factor (TF) gene families, and we performed at least 5–10 independent searches for each gene family
Genome-related public databases are invaluable to the scientific community, and transcriptional regulation of gene expression is a major control point in many biological processes
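
The highlights mention several independent searches per gene family. One plausible way to combine their results before assembly is to pool the read identifiers returned by each search and record how many searches found each read. The sketch below illustrates that idea only; it is not the authors' Perl pipeline, and the function names, the `min_support` threshold, and the read identifiers are hypothetical.

```python
def merge_search_hits(searches):
    """Count, for each read ID, how many independent searches returned it.

    `searches` is a list with one iterable of read IDs per search.
    Duplicate IDs within a single search are counted once.
    """
    support = {}
    for hits in searches:
        for read_id in set(hits):
            support[read_id] = support.get(read_id, 0) + 1
    return support


def candidate_reads(searches, min_support=1):
    """Return the sorted read IDs found by at least `min_support` searches.

    With the default of 1 this is the plain union of all searches, which
    favours sensitivity; a higher value (an assumption, not taken from the
    paper) would trade sensitivity for stricter agreement.
    """
    support = merge_search_hits(searches)
    return sorted(r for r, n in support.items() if n >= min_support)
```

Taking the union keeps any read that matches at least one version of the family-defining domain, leaving later steps such as contig assembly and verification to discard false positives.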

Summary

Introduction

Regulation of gene expression at the level of transcription is a major control point in many biological processes. No large-scale analysis of TFs has been made from a member of the Solanaceae, one of the most important families of vascular plants. To fill this void, we have analysed tobacco (Nicotiana tabacum) TFs using a dataset of 1,159,022 gene-space sequence reads (GSRs) obtained by methylation filtering of the tobacco genome. Tobacco [Nicotiana tabacum L.] is a member of the agriculturally important Solanaceae and is one of the most studied higher plant species, both because of its economic importance and because it is a convenient plant system for research. The one missing piece in the puzzle is the availability of the genome sequence of tobacco.
