Computationally Tractable Multivariate HMM in Genome-Wide Mapping Studies.

Hyungwon Choi, Debashis Ghosh, Zhaohui Qin

doi:10.1007/978-1-4939-6753-7_10

Abstract

The hidden Markov model (HMM) is widely used for modeling spatially correlated genomic data (series data). In genomics, datasets of this kind are generated by genome-wide mapping studies using high-throughput methods such as chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq). When multiple regulatory protein binding sites or related epigenetic modifications are mapped simultaneously, the correlation between data series can be incorporated into the latent variable inference through a multivariate form of the HMM, potentially increasing the statistical power of signal detection. In this chapter, we review the challenges of multivariate HMMs and propose a computationally tractable method called sparsely correlated HMMs (scHMM). We illustrate the method and the scHMM package using an example mouse ChIP-seq dataset.

Full Text