We study the problem of constructing a most compact physical map for a collection of clones whose ordering or interleaving on a DNA molecule is given. Each clone is a contiguous section of the DNA and is represented by its fingerprint obtained from biochemical experiments. In this paper we consider two kinds of mapping: single complete digest mapping, in which the fingerprint of a clone is a multiset containing the sizes of the restriction fragments occurring in the clone, and mapping by hybridization of probes, in which the fingerprint of a clone is a multiset consisting of the short oligonucleotide probes occurring in the clone. Our goal is to position the clones and restriction fragments (or probes) on the DNA consistently with the given ordering or interleaving so that the total number of restriction fragments (resp. probes) required on the DNA is minimized. We first formulate this as a constrained path cover problem on a multistage graph. Using this formulation, we show that finding a most compact map for clones with a given ordering is NP-hard. We then consider the approximability of the problem and present a simple approximation algorithm with ratio 2. This is in fact the best possible, as the above NP-hardness proof actually shows that achieving ratio 2 - ε is impossible for any constant ε > 0, unless P = NP. We also give a polynomial-time approximation scheme when the multiplicity is bounded by one (i.e., when the multisets are actually sets). The exact complexity of the problem in this special case is presently unknown. Finally, we consider the mapping problem when an interleaving is given that depicts how the clones overlap with each other on the DNA. In the case of restriction fragment data, we show that finding a consistent map is NP-complete even if the multiplicity is bounded by 3. This may suggest that information about the interleaving of clones does not necessarily make the problem computationally easier in single complete digest mapping. On the other hand, in the case of hybridization data, there is an efficient algorithm to construct a most compact map when the interleaving of clones is given.