Abstract

Background: The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize.

Results: A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and the remaining motifs from all programs were then combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter motifs in their promoters, perhaps uncovering a broader co-regulated gene network. Promzea was also tested against tissue-specific microarray data from maize.

Conclusions: An online tool customized for promoter motif discovery in plants, called Promzea, has been generated. Promzea was validated in silico by its ability to retrieve benchmark motifs and experimentally defined motifs and was tested using tissue-specific microarray data. Promzea predicted broader networks of gene regulation associated with the historic anthocyanin and phlobaphene biosynthetic pathways. Promzea is a new bioinformatics tool for understanding transcriptional gene regulation in maize and has been expanded to include rice and Arabidopsis.
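
The nucleotide sensitivity and specificity reported in the abstract can be illustrated with a minimal sketch. It assumes the commonly used nucleotide-level definitions, sensitivity = nTP / (nTP + nFN) and specificity = nTN / (nTN + nFP); the abstract does not state the exact formulas, so these are assumptions. The function names, the toy promoter and the site coordinates are hypothetical and are not taken from the Promzea benchmark.

```python
def site_positions(seq_id, start, length):
    """Return the set of (sequence, position) pairs covered by one site."""
    return {(seq_id, start + i) for i in range(length)}


def nucleotide_metrics(known, predicted, total_positions):
    """Compare predicted motif positions with known ones, nucleotide by nucleotide.

    known, predicted: sets of (sequence_id, position) tuples.
    total_positions: number of nucleotide positions that were scored.
    Assumed definitions (not confirmed by the source):
      sensitivity = nTP / (nTP + nFN)
      specificity = nTN / (nTN + nFP)
    """
    n_tp = len(known & predicted)   # predicted and truly in a site
    n_fp = len(predicted - known)   # predicted but not in a site
    n_fn = len(known - predicted)   # in a site but missed
    n_tn = total_positions - n_tp - n_fp - n_fn
    sensitivity = n_tp / (n_tp + n_fn) if (n_tp + n_fn) else 0.0
    specificity = n_tn / (n_tn + n_fp) if (n_tn + n_fp) else 0.0
    return {
        "nTP": n_tp,
        "nFP": n_fp,
        "nFN": n_fn,
        "nTN": n_tn,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Toy example: one 100-nt promoter, a known 6-nt site at positions 10-15
# and a predicted 6-nt site at positions 12-17 (hypothetical values).
known = site_positions("promoter_1", 10, 6)
predicted = site_positions("promoter_1", 12, 6)
metrics = nucleotide_metrics(known, predicted, total_positions=100)
print(metrics)
```

Scoring at the nucleotide level rewards partial overlap between a predicted and a known site, which is why a shifted prediction still contributes some true positives in this toy case.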

Highlights

  • The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies

  • Unfiltered MEME (Multiple Expectation-maximization for Motif Elicitation) predicted an average of 1,145 nucleotide true positives (nTP) and 29,982 nucleotide false positives (nFP)

  • Filtering each motif discovery program separately before combining the results reduced the average nFPs by 25.7% compared to the combined unfiltered data yet only reduced nTPs by 8.7% (Figure 3A-C, Table 2)


Summary

Introduction

The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The release of the maize (Zea mays L.) genome sequence [3] has facilitated searches for cis-acting motifs in one of the world’s most important crops. Software programs that apply a probabilistic algorithm include BioProspector [10] and MEME (Multiple Expectation-maximization for Motif Elicitation) [11]. BioProspector uses Gibbs sampling [12], which randomly picks subsequences of a defined length and iteratively searches within the input promoters until a high-probability match is found, defined as having position weight matrix (PWM) values that are significantly different from the input background sequences. In MEME, the sub-segment with the highest probability after expectation-maximization (EM) is chosen and refined by iterating the EM algorithm until the candidate motif cannot be improved [11].
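
The Gibbs sampling idea described above can be sketched in a few lines. This is a simplified, textbook-style site sampler written for illustration, not BioProspector's actual implementation: it assumes exactly one site per sequence, a zero-order background model estimated from the input, and sequences containing only A, C, G and T. All function names, parameters and the toy promoters are hypothetical.

```python
import random

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Column-wise base frequencies (a PWM) from equal-length sites."""
    pwm = []
    for col in range(len(sites[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def background_freqs(seqs, pseudocount=1.0):
    """Zero-order background base frequencies over all input sequences."""
    counts = {base: pseudocount for base in ALPHABET}
    for seq in seqs:
        for base in seq:
            counts[base] += 1
    total = sum(counts.values())
    return {base: counts[base] / total for base in ALPHABET}


def gibbs_motif_search(seqs, width, iterations=200, seed=0):
    """Sample one motif start per sequence (needs at least two sequences)."""
    rng = random.Random(seed)
    bg = background_freqs(seqs)
    # Start from randomly picked subsequences of the defined width.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build the PWM from the current sites of all other sequences.
            others = [s[p:p + width]
                      for j, (s, p) in enumerate(zip(seqs, starts)) if j != i]
            pwm = build_pwm(others)
            # Score every window by its PWM-to-background likelihood ratio.
            weights = []
            for p in range(len(seq) - width + 1):
                ratio = 1.0
                for k in range(width):
                    base = seq[p + k]
                    ratio *= pwm[k][base] / bg[base]
                weights.append(ratio)
            # Sample a new start in proportion to those ratios.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    final_sites = [s[p:p + width] for s, p in zip(seqs, starts)]
    return starts, build_pwm(final_sites)


# Toy promoters (hypothetical sequences, not maize data).
promoters = [
    "ACGTTGACCTAGGCTA",
    "TTGACCTACGGATCGA",
    "GGCATTGACCTATCGA",
    "CGATCGTTGACCTAAA",
]
starts, pwm = gibbs_motif_search(promoters, width=8)
print(starts)
```

Because each new start is sampled rather than chosen greedily, the search can leave a poor local optimum. Results depend on the random seed, so in practice such samplers are typically run several times.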

