Improving cis-regulatory elements modeling by consensus scaffolded mixture models

Hongshan Jiang, Weimin Zheng, Xuegong Zhang, Wenguang Chen, Ying Zhao

doi:10.1007/s11432-011-4374-9

Abstract

A position weight matrix (PWM) is widely accepted as a probabilistic representation for modeling protein-DNA binding specificity. Previous studies showed that, for factors that bind to divergent binding sites, mixtures of multiple PWMs improve performance. We propose a consensus scaffolded mixture PWM (CSM) model to improve the modeling of cis-regulatory elements by allowing overlapping components represented by a set of PWMs, each of which corresponds to a binding pattern and is scaffolded by a degenerate consensus. In addition, we propose a learning algorithm that involves an initial structure learning stage based on frequent pattern mining and a refining stage based on the expectation maximization (EM) algorithm. We assess the merits of CSM using three independent criteria. In a case study of the transcription factor Leu3, the derived CSM models agree with conventional mixtures but show a better fit according to the Fermi-Dirac distribution. Analysis of the human-mouse conservation of predicted binding sites of 83 JASPAR transcription factors (TFs) shows that the CSM is as good as or better than the simple mixture, the context-specific independent (CSI) mixture, and the single PWM model for 83%, 84%, and 75% of the cases, respectively. Five-fold cross validation on 46 TRANSFAC datasets shows that the CSM model generalizes better than the other mixture models.
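As a rough illustration of the EM refining stage mentioned in the abstract, the sketch below fits a plain mixture of PWMs to a set of aligned binding sites. It is not the CSM algorithm itself: the consensus scaffolding and the frequent-pattern-mining initialization are omitted, and a seeded random soft assignment is used instead. The function names, the pseudocount value, and the toy sites are all assumptions made for this example.

```python
import random

ALPHABET = "ACGT"


def site_probability(pwm, site):
    """Probability of `site` under one PWM, treating positions as independent."""
    p = 1.0
    for column, base in zip(pwm, site):
        p *= column[base]
    return p


def fit_pwm(sites, weights, pseudocount=0.5):
    """Weighted maximum-likelihood PWM; the pseudocount (an assumed value)
    keeps every cell strictly positive."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in ALPHABET}
        for site, w in zip(sites, weights):
            counts[site[pos]] += w
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in ALPHABET})
    return pwm


def em_pwm_mixture(sites, n_components=2, n_iter=50, seed=0):
    """Plain EM for a mixture of PWMs over equal-length sites.

    Returns (priors, pwms, responsibilities)."""
    rng = random.Random(seed)
    # Assumed initialization: random soft assignment of sites to components.
    resp = []
    for _ in sites:
        row = [rng.random() + 1e-3 for _ in range(n_components)]
        s = sum(row)
        resp.append([r / s for r in row])

    for _ in range(n_iter):
        # M-step: re-estimate mixing weights and one PWM per component.
        priors = [sum(r[k] for r in resp) / len(sites)
                  for k in range(n_components)]
        pwms = [fit_pwm(sites, [r[k] for r in resp])
                for k in range(n_components)]
        # E-step: posterior probability of each component for each site.
        resp = []
        for site in sites:
            joint = [priors[k] * site_probability(pwms[k], site)
                     for k in range(n_components)]
            total = sum(joint)
            resp.append([j / total for j in joint])
    return priors, pwms, resp


def mixture_probability(priors, pwms, site):
    """Probability of `site` under the fitted mixture."""
    return sum(pi * site_probability(pwm, site)
               for pi, pwm in zip(priors, pwms))


# Hypothetical toy sites with two visibly different binding patterns.
toy_sites = ["ACGTAC", "ACGTAC", "ACGTAA", "TTGCAA", "TTGCAA", "TTGCAT"]
priors, pwms, resp = em_pwm_mixture(toy_sites, n_components=2)
```

Each row of `resp` gives the posterior weight of every component for one site, and `mixture_probability(priors, pwms, site)` scores a candidate site under the fitted mixture. Because EM only finds a local optimum, the result depends on the initialization, which is presumably why the paper precedes EM with a structure learning stage.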


More From: Science China Information Sciences


