Abstract

Background: Segmental duplications in genomes have been studied for many years. Recently, several studies have highlighted a biological phenomenon called breakpoint-duplication that apparently associates a significant proportion of segmental duplications in mammals, and in the Drosophila species group, with breakpoints in rearrangement events.

Results: In this paper, we introduce and study a combinatorial problem, inspired by the breakpoint-duplication phenomenon, called the Genome Dedoubling Problem. It consists of finding a minimum-length rearrangement scenario required to transform a genome with duplicated segments into a non-duplicated genome such that duplications are caused by rearrangement breakpoints. We show that the problem, in the Double-Cut-and-Join (DCJ) and the reversal rearrangement models, can be reduced to an APX-complete problem, and we provide algorithms for the Genome Dedoubling Problem with 2-approximable parts. We apply the methods to the reconstruction of a non-duplicated ancestor of Drosophila yakuba.

Conclusions: We present the Genome Dedoubling Problem and describe two algorithms solving the problem, one in the DCJ model and one in the reversal model. The usefulness of the problem and the methods is shown through an application to real Drosophila data.

Highlights

  • Gene duplication is an important source of variations in genomes

  • Later in [2], a study of all evolutionary rearrangement breakpoints between human and mouse genomes reported that 53% of the breakpoints were associated with segmental duplications, as compared to 18% expected in a random assignment of breaks

  • In Section Genome dedoubling by DCJ, we study the problem under the DCJ model, on multichromosomal unichromosomal genomes


Summary

Results

We first study the Genome Dedoubling Problem under the DCJ model. We then study the problem under the reversal model on oriented genomes described in the Hannenhalli-Pevzner (HP) theory on sorting by reversal [12,13,14].

Let Ci be the maximum size of a subset of non-duplicated pairwise independent cycles in (G). The DCJ dedoubling distance of G is ddcj(G) = n – Ci. For example, in Fig. 1, the maximum size of a subset of non-duplicated pairwise independent cycles is 2, as there are three cycles and the two rightmost cycles intersect.

  1. The maximum size Ci of a set of non-duplicated pairwise independent cycles in the graph (G) is n.

  2. If G is a dedoubled genome, (G) contains n non-duplicated pairwise independent cycles, each containing a single couple of paralogous markers, plus possibly other cycles.
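The distance formula ddcj(G) = n – Ci can be illustrated with a small sketch. It is not the authors' algorithm: it assumes the cycles of the graph have already been enumerated, that two cycles are "independent" exactly when they are not listed as a conflicting (intersecting) pair, and it finds Ci by exhaustive search, which is only practical for tiny examples. The function names, the conflict-pair encoding, and the value n = 4 are hypothetical choices made for illustration.

```python
from itertools import combinations


def max_independent_cycles(num_cycles, conflicts):
    """Size of a largest subset of cycles containing no conflicting pair.

    Cycles are labeled 0..num_cycles-1; `conflicts` lists pairs of cycles
    that intersect (assumed encoding). Brute force, exponential time.
    """
    conflict_set = {frozenset(pair) for pair in conflicts}
    # Try the largest subset sizes first and stop at the first feasible one.
    for size in range(num_cycles, 0, -1):
        for subset in combinations(range(num_cycles), size):
            if all(frozenset(p) not in conflict_set
                   for p in combinations(subset, 2)):
                return size
    return 0


def dcj_dedoubling_distance(n, num_cycles, conflicts):
    """Evaluate n - Ci for a given n and cycle conflict list."""
    return n - max_independent_cycles(num_cycles, conflicts)


# Three cycles where the two rightmost (labeled 1 and 2) intersect,
# mirroring the situation described for Fig. 1.
c_i = max_independent_cycles(3, [(1, 2)])
print(c_i)

# n = 4 is a made-up value used only to show the arithmetic.
print(dcj_dedoubling_distance(4, 3, [(1, 2)]))
```

With three cycles and one intersecting pair, the search returns 2, which matches the value stated for Fig. 1; the distance then follows by subtracting that value from n.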

Introduction
Methods
A DCJ operation can only alter the maximum size
Conclusion
