Abstract
Single nucleotide polymorphisms (SNPs) are the most frequently occurring genetic variations. Biologists use identified SNPs to investigate genetic diseases and hereditary markers, and SNPs are also used to help prevent side effects of medication. SNPs therefore play an important role in personalized medicine. However, many association studies report only the relationships among SNPs, diseases and cancers, without giving an SNP ID. To identify SNPs in a sequence, this research built dbSNP, SNP FASTA and SNP flanking marker databases for the rat, mouse and human genomes from the NCBI database. The proposed method uses SNP flanking markers extracted from an SNP FASTA sequence and combines the Boyer–Moore algorithm with a dynamic programming method. The Boyer–Moore algorithm selects candidate SNPs from the SNP FASTA database for an unknown sequence, and the dynamic programming method then validates these candidates. The experimental results show that this method reliably determines exact SNP IDs from an unknown sequence. It constitutes a novel application for identifying SNP IDs from the literature and can be used in systematic association studies.
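The two-stage idea described above (fast exact search for a flanking marker, then dynamic programming validation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the simplified Boyer–Moore–Horspool variant in place of full Boyer–Moore, plain edit distance as the dynamic programming step, and a hypothetical in-memory database that maps an SNP ID to a (left flank, alleles, right flank) tuple. The function names, the `max_mismatch` threshold and the ID `rs_demo_1` are invented for illustration.

```python
def horspool_find_all(text, pattern):
    """Return every start index of pattern in text (Boyer-Moore-Horspool)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: distance from a character's last occurrence
    # (excluding the final pattern position) to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry for the text character under the pattern's end.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def identify_snps(query, snp_db, max_mismatch=1):
    """Return (snp_id, position, observed_base) for each validated candidate.

    snp_db is a hypothetical mapping: snp_id -> (left_flank, alleles, right_flank).
    """
    found = []
    for snp_id, (left, alleles, right) in snp_db.items():
        # Stage 1: exact search for the left flanking marker.
        for pos in horspool_find_all(query, left):
            snp_pos = pos + len(left)
            window = query[snp_pos + 1:snp_pos + 1 + len(right)]
            if snp_pos >= len(query) or len(window) < len(right):
                continue  # not enough sequence after the marker
            # Stage 2: allele check plus DP validation of the right flank.
            if query[snp_pos] in alleles and edit_distance(window, right) <= max_mismatch:
                found.append((snp_id, snp_pos, query[snp_pos]))
    return found


# Toy data only; the ID and flanks are made up for this example.
demo_db = {"rs_demo_1": ("ACGTAC", "AG", "TTGCAA")}
demo_query = "GG" + "ACGTAC" + "G" + "TTGCAA" + "CC"
print(identify_snps(demo_query, demo_db))
```

In this sketch the exact search keeps the expensive alignment step restricted to a few candidate positions, and the edit-distance threshold tolerates a small number of differences in the opposite flank. A real pipeline would need to handle both strands and much larger databases.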