A Cluster Refinement Algorithm for Motif Discovery

Gang Li, Kin-Hong Lee, Kwong-Sak Leung, Tak-Ming Chan

doi:10.1109/tcbb.2009.25

Abstract

Finding Transcription Factor Binding Sites, i.e., motif discovery, is crucial for understanding gene regulatory relationships. Motifs are weakly conserved, and motif discovery is an NP-hard problem. We propose a new approach called Cluster Refinement Algorithm for Motif Discovery (CRMD). CRMD employs a flexible statistical motif model that allows a variable number of motifs and motif instances. CRMD first uses a novel entropy-based clustering to find complete and good starting candidate motifs from the DNA sequences. CRMD then employs an effective greedy refinement to search for optimal motifs among the candidate motifs. The refinement is fast, and it changes the number of motif instances based on adaptive thresholds. The performance of CRMD is further enhanced when each sequence contains exactly one motif instance. Using an appropriate similarity test of motifs, CRMD is also able to find multiple motifs. CRMD has been tested extensively on synthetic and real data sets. The experimental results verify that CRMD usually outperforms four other state-of-the-art algorithms in solution quality, with competitive computing time. It strikes a good balance between finding true motif instances and screening out false motif instances, and it is robust on problems of various levels of difficulty.
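The abstract does not give the details of CRMD's motif model, clustering, or threshold rule, so the following is only a minimal sketch of the general idea of entropy-scored motifs with greedy refinement, not the authors' algorithm. All names (`position_frequencies`, `information_content`, `site_score`, `refine`), the pseudocount, the uniform background, and the rule of setting the threshold to a fixed fraction of the best attainable score are assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_frequencies(instances, pseudocount=0.5):
    """Column-wise base frequencies of equal-length motif instances."""
    width = len(instances[0])
    total = len(instances) + pseudocount * len(ALPHABET)
    matrix = []
    for j in range(width):
        counts = Counter(site[j] for site in instances)
        matrix.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return matrix


def information_content(matrix, background=0.25):
    """Relative entropy (in bits) of the motif against a uniform background."""
    return sum(p * math.log2(p / background)
               for column in matrix for p in column.values())


def site_score(site, matrix, background=0.25):
    """Log-odds score of one candidate site under the motif matrix."""
    return sum(math.log2(matrix[j][b] / background) for j, b in enumerate(site))


def refine(sequences, seeds, fraction=0.6, max_iter=20):
    """Greedy refinement: rebuild the instance set until it stops changing.

    Assumption: the threshold is recomputed each round as `fraction` of the
    best score attainable under the current matrix. This is a stand-in for
    the adaptive thresholds mentioned in the abstract, not the paper's rule.
    """
    width = len(seeds[0])
    instances = list(seeds)
    for _ in range(max_iter):
        matrix = position_frequencies(instances)
        best = sum(math.log2(max(col.values()) / 0.25) for col in matrix)
        threshold = fraction * best
        updated = [seq[i:i + width]
                   for seq in sequences
                   for i in range(len(seq) - width + 1)
                   if site_score(seq[i:i + width], matrix) > threshold]
        if not updated or sorted(updated) == sorted(instances):
            break
        instances = updated  # the number of instances may grow or shrink
    return instances


if __name__ == "__main__":
    seqs = ["GGCTATAATCGG", "CCTATAATGGCA", "ACGTATAATCC"]
    found = refine(seqs, seeds=["TATAAT", "TATACT"])
    print(found, information_content(position_frequencies(found)))
```

In this sketch the number of accepted instances is not fixed in advance: any window scoring above the current threshold is kept, so a sequence can contribute zero, one, or several instances. A real implementation would also need the clustering step that produces the seeds and a similarity test to separate multiple motifs, both of which are omitted here.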

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication Date: Oct 1, 2010
Citations: 46

Similar Papers

A fast weak motif-finding algorithm based on community detection in graphs.
Caiyan Jia ... Jian Yu
BMC Bioinformatics | VOL. 14
17 Jul 2013

An efficient implementation of EMD algorithm for motif discovery in time series data
Duong Tuan Anh ... Nguyen Van Nhat
International Journal of Data Mining, Modelling and Management | VOL. 8
01 Jan 2015

An Estimation of Distribution Algorithm for Motif Discovery
Gang Li ... Kin-Hong Lee
01 Jun 2008

Graphical approach to weak motif recognition.
Jagath C Rajapakse ... Xiao Yang
Genome Informatics | VOL. 15
11 Jul 2011
