A Frequent Pattern Mining Method for Finding Planted Motifs of Unknown Length in DNA Sequences

Caiyan Jia, Lusheng Chen, Ruqian Lu

doi:10.2991/ijcis.2011.4.5.26

Abstract

Identification and characterization of gene regulatory binding motifs is one of the fundamental tasks toward systematically understanding the molecular mechanisms of transcriptional regulation. Recently, the problem has been abstracted as the challenge planted (l,d)-motif problem. Previous studies have developed numerous methods to solve the problem, but most of them require the length l of the planted motif to be specified in advance and use a depth-first search strategy. In this study, we present an exact and efficient algorithm, called Apriori-Motif, that does not require the length l of a planted motif to be given a priori. It uses a breadth-first search strategy to prune the search space quickly by means of the downward closure property utilized in Apriori, a classical algorithm for frequent pattern mining. Empirical studies show that Apriori-Motif performs better than some existing methods.
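To make the pruning idea concrete, here is a minimal Python sketch of a level-wise (breadth-first) search that relies on downward closure. It is not the Apriori-Motif algorithm itself, only an illustration of the principle: if a pattern of length k+1 occurs in a sequence with at most d mismatches, then its length-k prefix does too, so a pattern whose prefix is infrequent never needs to be generated. The function names and the parameters `min_support` and `max_len` are assumptions introduced for this example.

```python
from typing import Dict, List

ALPHABET = "ACGT"

def occurs_within(pattern: str, seq: str, d: int) -> bool:
    """True if some window of seq differs from pattern in at most d positions."""
    k = len(pattern)
    for i in range(len(seq) - k + 1):
        mismatches = 0
        for a, b in zip(pattern, seq[i:i + k]):
            if a != b:
                mismatches += 1
                if mismatches > d:
                    break
        else:
            # Inner loop finished without exceeding d mismatches.
            return True
    return False

def support(pattern: str, seqs: List[str], d: int) -> int:
    """Number of sequences containing pattern with at most d mismatches."""
    return sum(occurs_within(pattern, s, d) for s in seqs)

def levelwise_patterns(seqs: List[str], d: int, min_support: int,
                       max_len: int) -> Dict[int, List[str]]:
    """Breadth-first search: extend only patterns that are still frequent."""
    frequent: Dict[int, List[str]] = {}
    level = [""]  # the empty pattern is trivially frequent
    for k in range(1, max_len + 1):
        # Candidates come only from frequent patterns of length k - 1.
        candidates = [p + c for p in level for c in ALPHABET]
        level = [p for p in candidates if support(p, seqs, d) >= min_support]
        if not level:
            break  # no longer pattern can be frequent
        frequent[k] = level
    return frequent

if __name__ == "__main__":
    toy = ["ACGTACGT", "TTACGTTT", "GGGACGTA"]
    print(levelwise_patterns(toy, d=0, min_support=3, max_len=4))
```

Because the search stops as soon as a level is empty, the pattern length does not have to be fixed in advance; `max_len` is only a safety cap in this sketch.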

Highlights

In the post-genomic era, a major challenge is deciphering the expression regulation of the thousands of annotated genes in genomes
We test the performance of Apriori-Motif on benchmark synthetic samples for the planted motif problem (PMP) under the conditions we are concerned with
A parent motif of length l is chosen by picking l bases from nucleotides A, C, G, T at random
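A synthetic instance of this kind could be generated roughly as follows. Only the first step (picking the parent motif at random from A, C, G, T) is stated above; the rest is an assumption for illustration, following the usual planted (l,d) setup: each of `t` random background sequences of length `n` receives one variant of the parent with exactly `d` substituted positions. The names `planted_instance`, `t`, `n`, and `seed` are hypothetical.

```python
import random
from typing import List, Tuple

ALPHABET = "ACGT"

def random_dna(length: int, rng: random.Random) -> str:
    """Uniformly random string over A, C, G, T."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def mutate(motif: str, d: int, rng: random.Random) -> str:
    """Copy of motif with exactly d positions changed to a different base."""
    chars = list(motif)
    for pos in rng.sample(range(len(chars)), d):
        chars[pos] = rng.choice([b for b in ALPHABET if b != chars[pos]])
    return "".join(chars)

def planted_instance(t: int, n: int, l: int, d: int,
                     seed: int = 0) -> Tuple[str, List[str], List[int]]:
    """Return (parent motif, sequences, planted start positions)."""
    rng = random.Random(seed)
    parent = random_dna(l, rng)  # the step described in the text
    seqs, positions = [], []
    for _ in range(t):
        background = random_dna(n, rng)
        start = rng.randrange(n - l + 1)
        variant = mutate(parent, d, rng)
        # Overwrite l background bases with the mutated variant.
        seqs.append(background[:start] + variant + background[start + l:])
        positions.append(start)
    return parent, seqs, positions

if __name__ == "__main__":
    parent, seqs, positions = planted_instance(t=5, n=50, l=8, d=2, seed=1)
    print(parent, positions)
```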

Summary

Introduction

In the post-genomic era, a major challenge is deciphering the expression regulation of the thousands of annotated genes in genomes. A signal (often called a motif) in DNA sequences is not exactly identical across its occurrences but presents mutations. This signal is a short subsequence, typically about 10 bp (base pairs) long, in the midst of a great amount of statistical noise, which makes it difficult for computational methods to discriminate. One kind of algorithm is based on the PWM (position-specific weight matrix, also called a profile) model and can find motifs of any specified length very efficiently. However, such algorithms are prone to getting trapped in a local optimum.
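As a rough illustration of the profile model mentioned above, the following Python sketch builds a position-specific matrix from a few aligned motif instances and uses it to score windows of a sequence. It shows only the model, not any particular profile-based search algorithm; the pseudocount, the uniform background frequency of 0.25, and all function names are assumptions made for this example.

```python
import math
from typing import Dict, List

ALPHABET = "ACGT"

def build_profile(instances: List[str],
                  pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Per-position base frequencies of equal-length aligned instances."""
    length = len(instances[0])
    total = len(instances) + pseudocount * len(ALPHABET)
    profile = []
    for j in range(length):
        column = [s[j] for s in instances]
        profile.append({b: (column.count(b) + pseudocount) / total
                        for b in ALPHABET})
    return profile

def log_odds(profile: List[Dict[str, float]], window: str,
             background: float = 0.25) -> float:
    """Log-odds score of a window against a uniform background."""
    return sum(math.log2(profile[j][b] / background)
               for j, b in enumerate(window))

def best_window(profile: List[Dict[str, float]], seq: str) -> int:
    """Start index of the highest-scoring window in seq."""
    l = len(profile)
    return max(range(len(seq) - l + 1),
               key=lambda i: log_odds(profile, seq[i:i + l]))

if __name__ == "__main__":
    profile = build_profile(["ACGT", "ACGA", "ACGT", "TCGT"])
    print(best_window(profile, "GGGACGTGGG"))
```

Profile-based methods typically start from an initial guess and refine the matrix iteratively, which is where the dependence on the starting point, and hence the risk of a local optimum, comes from.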

Preliminaries

Apriori-Motif

Complexity Analysis

Results and Discussions

Benchmark datasets

Comparison with some other algorithms

Discussions

Conclusions

Journal: International Journal of Computational Intelligence Systems
Publication Date: Jan 1, 2011
License type: cc-by

