Analysis of concordance of different haplotype block partitioning algorithms.

Amit R Indap, Michael Olivier, Craig A Struble, Gabor T Marth, Peter Tonellato

doi:10.1186/1471-2105-6-303


Open Access

https://doi.org/10.1186/1471-2105-6-303


Abstract

Background: Different classes of haplotype block algorithms exist, and the ideal dataset for assessing their performance would come from comprehensively re-sequencing a large genomic region in a large population. Such datasets are expensive to collect. As an alternative, we performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity-based, LD-based, and information-theoretic algorithms under different values of SNP density and allele frequency.

Results: We simulated 1000 haplotypes using the standard coalescent for three world populations (European, African American, and East Asian) and applied three classes of block partitioning algorithms (diversity-based, LD-based, and information-theoretic). We assessed algorithm differences in the number, size, and coverage of blocks inferred under different conditions of SNP density, allele frequency, and sample size. Each algorithm inferred blocks differing in number, size, and coverage under different density and allele frequency conditions. Different partitions had few, if any, matching block boundaries. However, they still overlapped, and a high percentage of the total chromosomal region was common to all methods. This percentage was generally higher with a higher density of SNPs and when rarer markers were included.

Conclusion: A gold standard definition of a haplotype block is difficult to achieve, but collecting haplotypes covered with a high density of SNPs, partitioning them with a variety of block algorithms, and identifying regions common to all methods may be the best way to identify genomic regions that harbor SNP variants that cause disease.
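The two concordance measures mentioned in the Results (matching block boundaries, and the share of the region common to all methods) can be illustrated with a small sketch. This is not the authors' code: the representation of blocks as half-open `(start, end)` intervals in base pairs, the function names, and the toy partitions are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

Block = Tuple[int, int]  # assumed representation: half-open [start, end) in bp


def merge(blocks: List[Block]) -> List[Block]:
    """Sort blocks and merge any that overlap or touch."""
    merged: List[Block] = []
    for start, end in sorted(blocks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Block], b: List[Block]) -> List[Block]:
    """Intersect two sorted, non-overlapping interval lists."""
    out: List[Block] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def common_coverage(partitions: List[List[Block]], region_length: int) -> float:
    """Fraction of the region that lies inside a block under every partition."""
    common = merge(partitions[0])
    for partition in partitions[1:]:
        common = intersect(common, merge(partition))
    return sum(end - start for start, end in common) / region_length


def shared_boundaries(partitions: List[List[Block]]) -> Set[int]:
    """Block start/end positions that appear in every partition."""
    boundary_sets = [{pos for block in p for pos in block} for p in partitions]
    return set.intersection(*boundary_sets)


# Toy partitions of a 200 kb region (hypothetical values, not from the paper).
diversity = [(0, 50_000), (60_000, 120_000), (130_000, 200_000)]
ld_based = [(0, 40_000), (45_000, 110_000), (125_000, 190_000)]
info_theoretic = [(5_000, 55_000), (60_000, 115_000), (120_000, 200_000)]

toy = [diversity, ld_based, info_theoretic]
print(common_coverage(toy, 200_000), shared_boundaries(toy))
```

In this toy example no boundary is shared by all three partitions, yet most of the region falls inside a block under every method, which mirrors the qualitative pattern the abstract describes.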

Highlights

Different classes of haplotype block algorithms exist and the ideal dataset to assess their performance would be to comprehensively re-sequence a large genomic region in a large population
A gold standard definition of a haplotype block is difficult to achieve, but collecting haplotypes covered with a high density of SNPs, partitioning them with a variety of block algorithms, and identifying regions common to all methods may be the best way to identify genomic regions that harbor SNP variants that cause disease
Data simulation and block partitioning: One thousand haplotypes representing a 200 kb region were generated via the standard coalescent with population-specific demographic profiles for three world populations: European, African American, and East Asian
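Varying SNP density and allele frequency, as described above, amounts to choosing a subset of the simulated markers before partitioning. The sketch below shows one way this could be done; the source does not describe the authors' actual procedure, so the random thinning step, the function names, and the parameters `min_maf`, `snps_per_kb`, and `region_kb` are all assumptions. Haplotypes are assumed to be sequences of 0/1 alleles.

```python
import random
from typing import List, Sequence


def minor_allele_frequency(column: Sequence[int]) -> float:
    """Minor allele frequency of one biallelic site coded as 0/1."""
    freq = sum(column) / len(column)
    return min(freq, 1 - freq)


def select_markers(
    haplotypes: Sequence[Sequence[int]],
    positions: Sequence[int],
    min_maf: float,
    snps_per_kb: float,
    region_kb: float,
    seed: int = 0,
) -> List[int]:
    """Return positions of SNPs passing a MAF cutoff, thinned to a target density.

    Thinning by uniform random subsampling is an illustrative assumption.
    """
    keep = [
        site
        for site in range(len(positions))
        if minor_allele_frequency([h[site] for h in haplotypes]) >= min_maf
    ]
    target = int(snps_per_kb * region_kb)
    if len(keep) > target:
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        keep = sorted(rng.sample(keep, target))
    return [positions[site] for site in keep]
```

Lowering `min_maf` admits rarer markers, and raising `snps_per_kb` increases density, which are the two conditions the abstract reports as affecting concordance.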

Summary

Introduction

Different classes of haplotype block algorithms exist, and the ideal dataset for assessing their performance would come from comprehensively re-sequencing a large genomic region in a large population. Such datasets are expensive to collect. We performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity-based, LD-based, and information-theoretic algorithms under different values of SNP density and allele frequency. Association studies work on the premise that SNP genotypes are correlated with a disease phenotype. SNPs that are in LD with a causative allele serve as proxies for it, so the association with the disease phenotype is maintained.
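How well a genotyped SNP can stand in for a causative allele is commonly quantified with the pairwise LD statistic r². The source does not say which LD measure the LD-based algorithm uses, so the following is only a minimal sketch of the standard r² calculation from phased haplotypes, with 0/1 allele coding and the function name as assumptions.

```python
from typing import Sequence


def r_squared(haplotypes: Sequence[Sequence[int]], i: int, j: int) -> float:
    """Pairwise LD (r^2) between biallelic sites i and j, alleles coded 0/1."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # frequency of allele 1 at site i
    p_b = sum(h[j] for h in haplotypes) / n          # frequency of allele 1 at site j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either site is monomorphic")
    return d * d / denom
```

An r² of 1 means the proxy SNP predicts the other site perfectly in the sample, while an r² of 0 means the two sites are statistically independent.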



Journal: BMC Bioinformatics
Publication Date: Dec 1, 2005
Citations: 18
License type: CC BY


