Abstract

Single nucleotide polymorphisms (SNPs), owing to their abundance and low mutation rate, are very useful genetic markers for genetic association studies. However, current genotyping technology cannot afford to genotype all common SNPs in all genes. By exploiting linkage disequilibrium, we can reduce the experimental cost by genotyping only a subset of SNPs, called Tag SNPs, which are strongly associated with the ungenotyped SNPs while being as independent from each other as possible. The problem of selecting Tag SNPs is NP-complete. When the number of SNPs is large, most existing Tag SNP selection methods avoid extremely long computation times by first partitioning the SNPs into blocks according to some block definition and then selecting Tag SNPs within each block by brute-force search. The Tag SNP set obtained in this way can usually be reduced further because of the inter-dependency among blocks. This paper proposes two algorithms, TSSA and TSSD, to tackle the block-independent Tag SNP selection problem. TSSA is based on the A* search algorithm, and TSSD is a heuristic algorithm. Experiments show that TSSA can find optimal solutions for medium-sized problems in reasonable time, while TSSD can handle very large problems and report approximate solutions very close to the optimal ones.
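
To make the block-independent selection problem concrete, the following is a minimal illustrative sketch of a greedy, set-cover-style Tag SNP selection over all SNPs at once, with no block partitioning. It is not TSSA or TSSD, whose details are not given in this abstract. The function names `r_squared` and `greedy_tag_snps`, the use of pairwise r^2 as the association measure, the 0/1 haplotype encoding, and the 0.8 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def r_squared(a, b):
    """Pairwise LD (r^2) between two biallelic SNPs coded 0/1 per haplotype."""
    n = len(a)
    pa = sum(a) / n                                # frequency of allele 1 at SNP a
    pb = sum(b) / n                                # frequency of allele 1 at SNP b
    pab = sum(x & y for x, y in zip(a, b)) / n     # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, treated as 0 here
    d = pab - pa * pb
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Hypothetical greedy heuristic (NOT the paper's TSSD).

    haplotypes: list of rows, each a list of 0/1 alleles, one column per SNP.
    Returns SNP indices chosen as tags, in the order they were picked.
    """
    n_snps = len(haplotypes[0])
    columns = [[row[j] for row in haplotypes] for j in range(n_snps)]

    # covers[i] = SNPs that SNP i can stand in for (always includes itself)
    covers = [{i} for i in range(n_snps)]
    for i, j in combinations(range(n_snps), 2):
        if r_squared(columns[i], columns[j]) >= threshold:
            covers[i].add(j)
            covers[j].add(i)

    uncovered = set(range(n_snps))
    tags = []
    while uncovered:
        # Pick the SNP covering the most still-uncovered SNPs;
        # ties are broken in favour of the lowest index.
        best = max(range(n_snps),
                   key=lambda i: (len(covers[i] & uncovered), -i))
        tags.append(best)
        uncovered -= covers[best]
    return tags


if __name__ == "__main__":
    # Toy data: SNPs 0, 1 and 2 are in perfect LD; SNP 3 is independent.
    haplotypes = [
        [0, 0, 1, 0],
        [0, 0, 1, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 1],
    ]
    print(greedy_tag_snps(haplotypes))
```

Because the greedy rule only approximates a minimum cover, a sketch like this may return more tags than an exact search would. That gap is the trade-off the abstract describes between an exact method and a heuristic one.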
