Abstract

Single-nucleotide polymorphisms (SNPs) are vital for identifying genetic-level variation in complex diseases. It has been found that the information carried by SNPs on adjacent or identical genes can be represented by a small number of tag SNPs (called a tag SNP-set or tagSNP-set). In this work, we propose a novel selection method called TagSNP-set Selection by Optimal Iteration with Linkage Disequilibrium (TSOILD) and develop a quantitative tagSNP-set prediction method called the Physical Distance-Linkage Disequilibrium Prediction Method (PDLDPM). To verify the validity of TSOILD and PDLDPM, a large amount of test data was generated with the simulation software HAPGEN2. According to the experimental results, TSOILD improves prediction accuracy by 6.73%, 3.19%, 6.52% and 1.72% over Random Sampling, the Genetic Algorithm (GA), the Greedy Algorithm and the TagSNP-Set Selection Method with Maximum Information (TSMI), respectively. In addition, PDLDPM, Linkage Coverage and selection of tag SNPs to maximize prediction accuracy (STAMPA) were used to evaluate the tagSNP-sets selected by Random Sampling, GA, the Greedy Algorithm and TSMI. The results show that PDLDPM performs better than the other two evaluation methods. These methods provide effective assistance for the study of genetic-level variation in complex diseases.

