Abstract

Background: Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription. Composite regulatory elements are a particular type of transcriptional regulatory element, consisting of pairs of individual DNA motifs. In contrast to the present approach, most available recognition techniques are based purely on statistical evaluation of the occurrence of single motifs. Such methods are limited in application, since the accuracy of recognition depends greatly on the size and quality of the sequence dataset. Methods that exploit available knowledge and have broad applicability are evidently needed.

Results: We developed a novel method to identify composite regulatory elements in promoters using a library of known examples. In-depth investigation of regularities encoded in known composite elements allowed us to introduce a new characteristic measure and to improve the specificity compared with other methods. Tests on an established benchmark and real genomic data show that our method outperforms other available methods based either on known examples or statistical evaluations. In addition to better recognition, the method has two practical advantages: first, the ability to detect a large number of different types of composite elements, and second, direct biological interpretation of the identified results. The program is available at http://gnaweb.helmholtz-hzi.de/cgi-bin/MCatch/MatrixCatch.pl and includes an option to extend the provided library with user-supplied data.

Conclusions: The novel algorithm for the identification of composite regulatory elements presented in this paper proved superior to existing methods. Its application to tissue-specific promoters identified several highly specific composite elements with relevance to their biological function. This approach, together with other methods, will further advance the understanding of transcriptional regulation of genes.

Highlights

  • Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription

  • By increasing the number of allowed nucleotide mismatches in both motifs and the distance between them, the accuracy of the method can be adjusted. Another method was developed for the recognition of the composite element NF-AT/AP-1 [4], with a score function based on weighted logarithms of position weight matrix (PWM) scores and an intermediate sequence restricted to a length of 5 to 11 bp

  • All three methods were tested on the same dataset by the same procedure
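
The kind of score function mentioned in the highlights (weighted logarithms of PWM scores combined with a spacer constraint of 5 to 11 bp) can be illustrated with a minimal sketch. This is not the published scoring function of [4] or of the present method: the hit representation, the weights `w1` and `w2`, the `threshold` value, and the function names are all assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical hit format: (start, end_exclusive, pwm_score), score in (0, 1].
Hit = Tuple[int, int, float]


def composite_score(s1: float, s2: float, w1: float = 1.0, w2: float = 1.0) -> float:
    """Weighted sum of log PWM scores (weights are illustrative assumptions)."""
    return w1 * math.log(s1) + w2 * math.log(s2)


def find_pairs(
    hits_a: List[Hit],
    hits_b: List[Hit],
    min_gap: int = 5,
    max_gap: int = 11,
    threshold: float = -0.5,  # assumed cutoff, not taken from the paper
) -> List[Tuple[int, int, int, float]]:
    """Return (start_a, start_b, gap, score) for motif pairs within the spacer range."""
    pairs = []
    for a_start, a_end, a_score in hits_a:
        for b_start, b_end, b_score in hits_b:
            # Length of the intermediate sequence, whichever motif comes first.
            if b_start >= a_end:
                gap = b_start - a_end
            elif a_start >= b_end:
                gap = a_start - b_end
            else:
                continue  # overlapping hits are skipped in this sketch
            if min_gap <= gap <= max_gap:
                score = composite_score(a_score, b_score)
                if score >= threshold:
                    pairs.append((a_start, b_start, gap, score))
    return pairs


# Toy input: one hit for the first motif, two candidate partners for the second.
hits_a = [(10, 16, 0.9)]
hits_b = [(23, 30, 0.8), (40, 47, 0.95)]
pairs = find_pairs(hits_a, hits_b)
```

In this toy input only the first partner lies within the allowed spacer range (7 bp), so it is the only pair reported. Loosening `max_gap` or `threshold` trades specificity for sensitivity, in the same spirit as adjusting the allowed mismatches and distance described above.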


Summary

Introduction

Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription. In contrast to the present approach, most available recognition techniques are based purely on statistical evaluation of the occurrence of single motifs. Such methods are limited in application, since the accuracy of recognition depends greatly on the size and quality of the sequence dataset. The structure and primary sequence of composite elements (CEs) have been studied in a number of different experiments, in particular to confirm protein-protein interactions and cooperative binding to DNA, as well as effects on transcriptional regulation. Such data on CEs can be found in databases such as TRANSCompel [9].

