An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance

Ana C Casimiro, Susana Vinga, Arlindo L Oliveira, Ana T Freitas

doi:10.1186/1471-2105-9-89

Abstract

Background: Motif finding algorithms have become increasingly able to detect patterns in biological sequences using computationally efficient methods. However, the posterior classification of their output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in DNA sequences, which might not only indicate that a pattern is important, but also provide hints about the positions where the pattern occurs preferentially.

Results: We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed best on this dataset, we proceeded to study the positional distribution of several well-known cis-regulatory elements in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several dicotyledonous plants). The results show that position conservation is relevant for the transcriptional machinery.

Conclusion: We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes and, therefore, that non-uniformity is a good indicator of biological relevance that can be used to complement the commonly used over-representation tests. In this article we present the results obtained for the S. cerevisiae data sets.

Highlights

Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences
In this article we propose to integrate position uniformity tests and over-representation tests based on Markov models to improve the posterior classification of the motifs and better assess their biological significance
We propose that the integration of position uniformity tests and over-representation tests can be used to improve the accuracy of the classification of motifs found by combinatorial motif finders

Summary

Introduction

Motif finding algorithms have become increasingly able to detect patterns in biological sequences using computationally efficient methods. The computational analysis of DNA sequences represents a major endeavor in the post-genomic era. The increasing number of whole-genome sequencing projects has provided an enormous amount of information, creating a need for new tools and string processing algorithms to analyze and classify the obtained sequences [1]. In this regard, the study of short functional DNA segments, such as transcription factor binding sites, has emerged as an important effort to understand key control mechanisms. DNA motifs can be represented in a number of different ways. Our approach is independent of the way motifs are modeled, since it requires only the list of occurrences of each motif, which can be obtained from any motif representation.
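To make the idea of a position uniformity test on a list of occurrences more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the number of bins, the number of simulations and the 0-based coordinate convention (offset from the start of the promoter region) are all assumptions made for illustration. It computes a Pearson chi-square statistic over equal-width bins and estimates the p-value by Monte Carlo simulation of uniformly placed occurrences, which may differ from the specific tests compared in the paper.

```python
import random


def chi_square_statistic(positions, region_length, n_bins):
    """Pearson chi-square statistic of binned positions vs. a uniform expectation.

    Assumes 0-based positions in the range [0, region_length).
    """
    counts = [0] * n_bins
    for p in positions:
        # Map a position to one of n_bins equal-width bins.
        counts[p * n_bins // region_length] += 1
    expected = len(positions) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


def uniformity_p_value(positions, region_length, n_bins=10, n_sim=2000, seed=0):
    """Monte Carlo p-value for the null hypothesis of uniform positions.

    Returns the (add-one corrected) fraction of simulated uniform samples
    whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = chi_square_statistic(positions, region_length, n_bins)
    n = len(positions)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(region_length) for _ in range(n)]
        if chi_square_statistic(simulated, region_length, n_bins) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Hypothetical data: 50 occurrences clustered in a 50 bp window of a 1000 bp region.
clustered = list(range(100, 150))
# Hypothetical data: one occurrence at the start of each 100 bp bin.
evenly_spread = list(range(0, 1000, 100))

p_clustered = uniformity_p_value(clustered, region_length=1000)
p_even = uniformity_p_value(evenly_spread, region_length=1000)
```

In this sketch a small p-value for the clustered list suggests a positional bias, while the evenly spread list gives no evidence against uniformity. The choice of bin count affects sensitivity, so it would need to be tuned to the region length and the number of occurrences.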

Methods

Results

Conclusion

Journal: BMC Bioinformatics	Publication Date: Feb 7, 2008
Citations: 44	License type: cc-by

Similar Papers

Finding Possible Promoter Binding Sites in DNA Sequences by Sequential Patterns Mining With Specific Numbers of Gaps.
Yu-Hao Ke ... Wei-Chen Lin
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 18
13 Mar 2020

Author response: PI3K signaling specifies proximal-distal fate by driving a developmental gene regulatory network in SOX9+ mouse lung progenitors
Sharlene Fernandes ... Suchi Singh Jain
14 Jun 2022

Detecting periodic patterns in biological sequences.
E Coward ... F Drabløs
Bioinformatics (Oxford, England) | VOL. 14
01 Jan 1998

Deciphering principles of transcription regulation in eukaryotic genomes
Dat H Nguyen ... Patrik D'Haeseleer
Molecular Systems Biology | VOL. 2
01 Jan 2006
