Abstract
Putative cis-regulatory sequences can be readily identified by running pattern recognition programs on the promoters of specific gene sets. The abundance of predicted cis-sequences, however, makes it a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. To test the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’, which correctly identified the associated stress conditions. For a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely induced by Botrytis cinerea. Consistent with this prediction, a reporter gene harbouring this sequence in its promoter shows the strongest induction with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php.
Highlights
Eukaryotic gene expression is largely regulated by the binding of transcription factors (TFs) to cis-sequences in promoter regions
These were identified in the A. thaliana genome and annotated to the AthaMap database, which is linked to PathoPlant [13, 33, 34]
Cis-sequences that occur in promoters were associated with specific gene expression profiles
Summary
Eukaryotic gene expression is largely regulated by the binding of transcription factors (TFs) to cis-sequences in promoter regions. One way to identify the conditions under which a cis-sequence may confer gene expression is to analyse whether genes harbouring this sequence in their promoter show a specific expression profile under certain environmental conditions [7]. This has been done for many sequences, predicting known and novel expression profiles [8,9,10], which shows that such an approach is a useful way to identify the possible function of specific cis-sequences. For such predictions, it may be helpful to have an online web tool that permits such an analysis for any given cis-sequence. The database is manually annotated with data from the literature. It contains data for 99 plant species and varieties, 107 pathogens and 638 molecules from 619 references [14]. In total, 144 different microarray data sets from Arabidopsis thaliana, corresponding to 36 different abiotic and biotic stimuli, have been annotated to PathoPlant.
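The general approach described above (find the genes whose promoter contains a given cis-sequence, then ask under which conditions those genes are most strongly expressed) can be illustrated with a minimal sketch. This is not the web tool's implementation: the scoring by mean log2 fold change, the search on both strands, the function names and the toy promoter and expression values are all assumptions made for illustration only.

```python
from statistics import mean

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def genes_with_motif(promoters: dict, motif: str) -> set:
    """Return genes whose promoter contains the motif on either strand.

    Searching both strands is an assumption of this sketch.
    """
    motif = motif.upper()
    rc = reverse_complement(motif)
    return {
        gene
        for gene, seq in promoters.items()
        if motif in seq.upper() or rc in seq.upper()
    }


def rank_conditions(genes: set, expression: dict) -> list:
    """Rank conditions by the mean log2 fold change of the given genes.

    Mean fold change is a hypothetical scoring choice, not the
    scoring used by the published tool.
    """
    scores = {}
    for condition, values in expression.items():
        hits = [values[g] for g in genes if g in values]
        if hits:
            scores[condition] = mean(hits)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): promoter sequences and log2 fold changes.
promoters = {
    "geneA": "TTGACGACTTTTAGC",  # contains CGACTTTT
    "geneB": "GGCAAAAGTCGTTAA",  # contains the reverse complement AAAAGTCG
    "geneC": "ATATATATATATAT",   # no match
}
expression = {
    "pathogen": {"geneA": 3.0, "geneB": 2.0, "geneC": 0.1},
    "cold": {"geneA": 0.5, "geneB": -0.5, "geneC": 2.0},
}

genes = genes_with_motif(promoters, "CGACTTTT")
ranking = rank_conditions(genes, expression)
```

With this toy input, the WT-box sequence is found in the promoters of geneA and geneB, and the invented "pathogen" condition ranks above "cold" because the matching genes are more strongly induced there. A real analysis would additionally need a defined promoter region and a statistical comparison against genes lacking the sequence.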