A haplotype inference method based on sparsely connected multi-body Ising model

Masashi Kato, Hiroshi Chigira, Masato Inoue, Qian Ji Gao, Hiroyuki Shindo

doi:10.1088/1742-6596/233/1/012022

Open Access

Journal: Journal of Physics: Conference Series
Publication Date: Jun 1, 2010
Citations: 1
License type: cc-iop-open

Affiliation: Waseda University

Abstract

Statistical haplotype inference is an indispensable technique in medical science. The method usually has two steps: inference of haplotype frequencies and inference of the diplotype for each subject. The first step can be done with the expectation-maximization (EM) algorithm, but its computational cost becomes prohibitive when the number of single-nucleotide polymorphism (SNP) loci under study is large. In this article, we describe an approximate probabilistic model of haplotype frequencies. The model is constructed from several distributions over nearby local SNPs. This approximation is reasonable because SNPs are generally more strongly correlated when they are close to one another on a chromosome. To implement this approach, we use a log-linear model, the Walsh-Hadamard transform, and a combinatorial optimization method. Experiments on artificial data suggest that the overall haplotype inference of our method is good if there are nine or more local consecutive SNPs. Some minor problems remain to be addressed before this method can be applied to real data.
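
The abstract does not spell out how the log-linear model and the Walsh-Hadamard transform fit together, so the following is only a minimal sketch of one standard way they relate, not the authors' implementation. It assumes biallelic SNPs encoded as spins s_i = +1 or -1, strictly positive haplotype frequencies, and a log-linear form log p(s) = sum over SNP subsets A of J_A times the product of s_i for i in A. Under those assumptions the multi-body coefficients J_A are the normalized Walsh-Hadamard transform of the log-frequencies. The function names (`fwht`, `couplings_from_frequencies`, `frequencies_from_couplings`) and the example frequencies are hypothetical.

```python
import math


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural/Hadamard order).

    Output[m] = sum over k of (-1)^popcount(m & k) * values[k].
    The input length must be a power of two.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a


def couplings_from_frequencies(freqs):
    """Map haplotype frequencies to log-linear (Ising-style) coefficients.

    Index k encodes a haplotype: bit i of k is the allele at SNP i,
    and the spin is s_i = +1 for bit 0, -1 for bit 1.
    Index m of the result encodes a subset of SNPs; J[m] multiplies the
    product of the spins in that subset. J[0] absorbs the normalization.
    Assumes every frequency is strictly positive (illustrative only).
    """
    log_p = [math.log(f) for f in freqs]
    return [c / len(freqs) for c in fwht(log_p)]


def frequencies_from_couplings(couplings):
    """Inverse map: rebuild haplotype frequencies from the coefficients."""
    return [math.exp(v) for v in fwht(couplings)]


if __name__ == "__main__":
    # Hypothetical frequencies for two independent SNPs
    # (allele-0 frequencies 0.6 and 0.7), ordered k = 0, 1, 2, 3.
    freqs = [0.42, 0.28, 0.18, 0.12]
    J = couplings_from_frequencies(freqs)
    print("single-SNP terms:", J[1], J[2])
    print("pairwise term:", J[3])
    print("round trip:", frequencies_from_couplings(J))
```

For independent SNPs the pairwise coefficient `J[3]` is zero up to floating-point error, which illustrates why a model that keeps only coefficients among nearby, strongly correlated SNPs can be sparse. The full transform still touches all 2^n haplotypes, so this sketch is practical only for small local windows.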
