Abstract

Medium-resolution genotyping aims to distinguish subgroups rather than every individual element within a group. An oligonucleotide array provides an inexpensive, high-throughput method for identifying differences in DNA sequence among individuals, which is fundamental to genotyping. Because the cost and difficulty of designing and fabricating an oligonucleotide array increase dramatically with the number of probes used, it is important to find a design that uses the minimum number of probes while still meeting the requirements of medium-resolution genotyping. The first algorithm for designing and selecting probes for oligonucleotide array-based medium-resolution typing is reported. The algorithm, which is based on entropy, conditional entropy and mutual information, selects a minimum number of probes from a large probe set with minimal loss of resolution. It was tested on a human leukocyte antigen (HLA) sequence data set. Thirty probes were selected from 390 probes for HLA-A, and 60 probes were selected from 767 probes for HLA-B. Although the number of probes was reduced to less than a tenth of the original set, distinguishability fell only slightly: by 0.45 percentage points (from 99.90% to 99.45%) for HLA-A and by 0.27 percentage points (from 99.84% to 99.57%) for HLA-B. This is a satisfactory and practical result.
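
The abstract does not give the selection procedure itself, so the following is only a minimal sketch of one way an entropy-driven probe selection could work, not the authors' algorithm. It assumes a binary hybridization matrix `patterns[a][p]` (1 if allele `a` binds probe `p`), a greedy strategy that at each step adds the probe with the largest conditional entropy given the probes already chosen, and a definition of distinguishability as the fraction of allele pairs with different signatures. The function names `partition_entropy`, `select_probes` and `distinguishability` are hypothetical.

```python
from collections import Counter
from math import log2


def partition_entropy(signatures):
    """Shannon entropy (bits) of the grouping induced by the signatures."""
    total = len(signatures)
    counts = Counter(signatures)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def select_probes(patterns, k):
    """Greedily pick up to k probe indices (assumed strategy, see lead-in).

    Each step adds the probe whose conditional entropy given the probes
    already chosen is largest, i.e. the largest gain in joint entropy.
    Selection stops early once no remaining probe adds information.
    """
    n_probes = len(patterns[0])
    chosen = []
    signatures = [() for _ in patterns]
    current = 0.0
    for _ in range(k):
        best_gain, best_probe = 0.0, None
        for p in range(n_probes):
            if p in chosen:
                continue
            trial = [sig + (row[p],) for sig, row in zip(signatures, patterns)]
            gain = partition_entropy(trial) - current
            # Small tolerance so ties keep the lowest-index probe.
            if gain > best_gain + 1e-12:
                best_gain, best_probe = gain, p
        if best_probe is None:
            break
        chosen.append(best_probe)
        signatures = [sig + (row[best_probe],)
                      for sig, row in zip(signatures, patterns)]
        current += best_gain
    return chosen


def distinguishability(patterns, probes):
    """Fraction of allele pairs whose signatures differ on the given probes."""
    sigs = [tuple(row[p] for p in probes) for row in patterns]
    n = len(sigs)
    same = sum(c * (c - 1) // 2 for c in Counter(sigs).values())
    return 1 - same / (n * (n - 1) // 2)


if __name__ == "__main__":
    # Toy data: 4 alleles x 4 probes (illustrative only, not HLA data).
    toy = [
        [0, 0, 1, 0],
        [0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 1, 1],
    ]
    picked = select_probes(toy, 2)
    print(picked, distinguishability(toy, picked))
```

In this toy matrix the third probe binds every allele and so carries no information, while the fourth is largely redundant once the first two are chosen; a greedy entropy criterion skips both. Real probe sets would also need to handle cross-hybridization and heterozygous samples, which this sketch ignores.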
