Abstract

The Human Genome Project requires better software for creating physical maps of chromosomes. Current mapping techniques break large segments of DNA into smaller, more manageable pieces, gather information on each of the small pieces, and then construct a map of the original large piece from that information. Unfortunately, breaking up the DNA loses some information and introduces noise of various types; in particular, the order of the pieces is not preserved. Thus, the map maker must solve a combinatorial problem in order to reconstruct the map. Good software is indispensable for quick, accurate reconstruction. The reconstruction is complicated by various experimental errors. A major source of difficulty, which seems to be inherent to the recombination technology, is the presence of chimeric DNA clones. It is fairly common for two disjoint DNA pieces to form a chimera, i.e., a fusion of two pieces that appears as a single piece. Attempts to order chimeras will fail unless they are algorithmically divided into their constituent pieces. Despite consensus within the genomic mapping community on the critical importance of correcting chimerism, algorithms for solving the chimeric clone problem have received only passing attention in the literature. Based on a model proposed by Lander (1992a, b), this paper presents the first algorithms for analyzing chimerism. We construct physical maps in the presence of chimerism by defining optimization functions whose minima correlate with map quality. Although optimizing these functions is invariably NP-complete, our algorithms are guaranteed to produce solutions that are close to the optimum. The practical value of these algorithms depends on how strongly each function correlates with map quality, as well as on the accuracy of the approximations.
We employ two fundamentally different optimization functions as a means of avoiding biases likely to decorrelate the solutions from the desired map. Experiments on simulated data show that both of our algorithms, one minimizing the number of chimeric fragments in a solution and the other minimizing the maximum number of fragments per clone in a solution, do in fact correlate with high-quality solutions. Furthermore, tests on simulated data using parameters set to mimic real experiments show that the algorithms have the potential to find high-quality solutions with real data. We plan to test our software against real data from the Whitehead Institute and from Los Alamos Genomic Research Center in the near future.
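
To make the two objectives concrete, the following is a minimal sketch and not the approximation algorithms of the paper. It assumes, as an illustration only, that the data are a 0/1 clone-by-probe matrix and that, for a given probe order, each maximal run of consecutive 1s in a clone's row is one fragment of that clone. The names `count_fragments`, `score`, and `brute_force_order` are hypothetical, and "number of chimeric fragments" is interpreted here as the number of fragments beyond the first in each clone. The exhaustive search is exponential in the number of probes and is included only to show what is being minimized on a toy input.

```python
from itertools import permutations


def count_fragments(row, order):
    """Count maximal runs of 1s in `row` when its columns are read in `order`."""
    runs = 0
    previous = 0
    for col in order:
        current = row[col]
        if current and not previous:
            runs += 1  # a new run of hits starts here
        previous = current
    return runs


def score(matrix, order):
    """Return (extra_fragments, max_fragments_per_clone) for a probe order.

    extra_fragments counts fragments beyond the first in each clone
    (an assumed reading of "number of chimeric fragments").
    """
    frags = [count_fragments(row, order) for row in matrix]
    extra = sum(max(f - 1, 0) for f in frags)
    worst = max(frags, default=0)
    return extra, worst


def brute_force_order(matrix, objective):
    """Exhaustively pick the probe order minimizing objective(score(...)).

    Exponential time: usable only on toy inputs, unlike the
    approximation algorithms the abstract refers to.
    """
    n_probes = len(matrix[0])
    return min(permutations(range(n_probes)),
               key=lambda order: objective(score(matrix, order)))


# Toy data: rows are clones, columns are probes 0..3.
# Clone 3 hits probes 0 and 3 only, as a chimera of two distant pieces would.
toy = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]

best_by_extra = brute_force_order(toy, lambda s: s[0])
best_by_worst = brute_force_order(toy, lambda s: s[1])
print(score(toy, (0, 1, 2, 3)), score(toy, best_by_extra), score(toy, best_by_worst))
```

In this toy input no probe order makes every clone a single run, so any order must split at least one clone; both objectives accept the order 0, 1, 2, 3, which treats clone 3 as the chimera.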
