qPMS9: an efficient algorithm for quorum Planted Motif Search

Marius Nicolae, Sanguthevar Rajasekaran

doi:10.1038/srep07813

Open Access

Journal: Scientific Reports
Publication Date: Jan 15, 2015
Citations: 34
License type: CC BY 4.0

Affiliation: University of Connecticut

Abstract

Discovering patterns in biological sequences is a crucial problem. For example, the identification of patterns in DNA sequences has resulted in the determination of open reading frames, identification of gene promoter elements, intron/exon splicing sites, and SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have led to domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, discovery of short functional motifs, etc. In this paper we focus on the identification of an important class of patterns, namely, motifs. We study the (ℓ, d) motif search problem or Planted Motif Search (PMS). PMS receives as input n strings and two integers ℓ and d. It returns all sequences M of length ℓ that occur in each input string, where each occurrence differs from M in at most d positions. Another formulation is quorum PMS (qPMS), where the motif appears in at least q% of the strings. We introduce qPMS9, a parallel exact qPMS algorithm that offers significant runtime improvements on DNA and protein datasets. qPMS9 solves the challenging DNA (ℓ, d)-instances (28, 12) and (30, 13). The source code is available at https://code.google.com/p/qpms9/.
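To make the problem statement concrete, the following is a minimal brute-force sketch of the (ℓ, d) motif search definition given above. It is not the qPMS9 algorithm, which the abstract does not describe; it simply enumerates every candidate string of length ℓ and checks the definition directly, so it is only practical for very small ℓ. The function names, the `q` parameter (a percentage, with 100 giving plain PMS), and the default DNA alphabet are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs(candidate, s, d):
    """True if some length-l window of s differs from candidate in at most d positions."""
    l = len(candidate)
    return any(hamming(candidate, s[i:i + l]) <= d
               for i in range(len(s) - l + 1))


def brute_force_qpms(strings, l, d, q=100, alphabet="ACGT"):
    """Return all (l, d) motifs that occur in at least q% of the input strings.

    Hypothetical helper for illustration only: it tries all len(alphabet)**l
    candidates, so the cost grows exponentially with l.
    """
    motifs = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        hits = sum(occurs(candidate, s, d) for s in strings)
        # Integer comparison avoids floating-point rounding on the quorum.
        if hits * 100 >= q * len(strings):
            motifs.append(candidate)
    return motifs


strings = ["ACGTAC", "TTACGA", "GGACGT"]
print(brute_force_qpms(strings, 3, 0))        # ['ACG']
print(brute_force_qpms(strings, 3, 0, q=60))  # ['ACG', 'CGT', 'TAC']
```

With q = 100 the motif must occur in every string, which is the PMS formulation; lowering q gives the quorum formulation, where a motif only needs to occur in at least q% of the strings.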

Highlights

Discovering patterns in biological sequences is a crucial problem
We study the (ℓ, d) motif search problem or Planted Motif Search (PMS)
PMS returns all sequences M of length ℓ that occur in each input string, where each occurrence differs from M in at most d positions

Summary

Introduction

Discovering patterns in biological sequences is a crucial problem. PMS returns all possible biological sequences M of length ℓ such that M occurs in each of the input strings, and each occurrence differs from M in at most d positions. Buhler and Tompa have employed PMS algorithms to find known transcriptional regulatory elements upstream of several eukaryotic genes. They have used orthologous sequences from different organisms upstream of four different genes: preproinsulin, dihydrofolate reductase (DHFR), metallothioneins, and c-fos. They have employed the upstream regions involved in purine metabolism from three Pyrococcus genomes.
