Abstract

We present a Bayesian motif discovery (BMD) algorithm for detecting an unknown number of instances of a motif in a given set of sequences. The algorithm models a motif with a position weight matrix (PWM), which is estimated jointly with the motif discovery process. The technique is flexible enough to accept the results of other discovery algorithms as input. The method is based on a sequential Monte Carlo algorithm, in which the state to be estimated consists of the number of motif instances in each sequence and their starting positions. The accuracy of the proposed method is compared with that of other profile-based discovery algorithms. BMD is shown to perform statistically better than MEME and BioProspector in applications ranging from synthetic data to genomic motif finding for Din serine recombinases. In the case of site-specific recombinase target discovery, the BMD-inferred motif is found to be the only one that is functionally accurate from the standpoint of the underlying biochemical mechanism.
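
To make the PWM representation concrete, the following minimal Python sketch estimates a PWM from a handful of aligned motif instances and scores candidate starting positions in a sequence by log-odds against a uniform background. It is an illustration only and is not the BMD algorithm: it performs no sequential Monte Carlo estimation and assumes exactly one instance per sequence. The function names (`estimate_pwm`, `log_odds`, `best_start`), the pseudocount, the uniform background, and the toy sequences are all assumptions introduced here.

```python
import math

ALPHABET = "ACGT"


def estimate_pwm(instances, pseudocount=1.0):
    """Estimate per-column base frequencies from equal-length motif instances.

    The pseudocount (an assumption of this sketch) keeps every probability
    strictly positive so that log-odds scores stay finite.
    """
    width = len(instances[0])
    pwm = []
    for j in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for instance in instances:
            counts[instance[j]] += 1.0
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in ALPHABET})
    return pwm


def log_odds(pwm, window, background=0.25):
    """Log-odds score of a window under the PWM versus a uniform background."""
    return sum(math.log(col[base] / background) for col, base in zip(pwm, window))


def best_start(pwm, sequence):
    """Return the starting position whose window scores highest under the PWM."""
    width = len(pwm)
    starts = range(len(sequence) - width + 1)
    return max(starts, key=lambda s: log_odds(pwm, sequence[s:s + width]))


# Hypothetical toy data: four aligned instances of a six-base motif.
instances = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pwm = estimate_pwm(instances)
start = best_start(pwm, "GGGCTATAATGCGC")
```

Here `start` is the highest-scoring starting position in the toy sequence. A full treatment along the lines of the abstract would instead treat the number of instances per sequence and their starting positions as unknown state and update the PWM estimate as that state is inferred.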
