Abstract

Biochemical analysis procedures, that include genomics and drug discovery, have been formalized to an extent that they can be automated. Large Microarrays housing DNA probes are used for this purpose. Manufacturing these microarrays involve depositing the respective DNA probes in each of its cells. The deposition is carried out iteratively by masking and unmasking cells in each step. A masked cell of the microarray that is adjacent (shares a border) to an unmasked one is at a high risk of being exposed in a deposition step. Thus, minimizing the number of such borders (Border length minimization) is crucial for reliable manufacturing of these microarrays. Given a microarray and a set of DNA probes, computing a lower bound on the border length is crucial to study the effectiveness of any algorithm that solves the border length minimization problem. A Numerical method for computing this lower bound has been proposed in the literature. This takes prohibitively large time. In practice, the DNA probes are random sequences of nucleotides. Based on this realistic assumption, this paper attempts to estimate the lower bound for the border length analytically using a probability theoretic approach by reducing the same to the problems of computing the probability distribution functions (PDF) for the Hamming Distance and the length of the longest common subsequence (LCS) between two random strings. To the best of our knowledge, no PDF is reported earlier for the length of the LCS between two random strings.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call