Abstract

Background: The introduction of the double cut and join operation (DCJ) caused a flurry of research into the study of multichromosomal rearrangements. However, little of this work has incorporated indels (i.e., insertions and deletions of chromosomes and chromosomal intervals) into the calculation of genomic distance functions, with the exception of Braga et al., who provided a linear time algorithm for the problem of DCJ-indel sorting. Although their algorithm only takes linear time, its derivation is lengthy and depends on a large number of possible cases.

Results: We note the simple idea that a deletion of a chromosomal interval can be viewed as a DCJ that creates a new circular chromosome. This framework will allow us to amortize indels as DCJs, which in turn permits the application of the classical breakpoint graph to obtain a simplified indel model that still solves the problem of DCJ-indel sorting in linear time via a more concise formulation that relies on the simpler problem of DCJ sorting. Furthermore, we can extend this result to fully characterize the solution space of DCJ-indel sorting.

Conclusions: Encoding indels as DCJ operations offers a new insight into why the problem of DCJ-indel sorting is not ultimately any more difficult than that of sorting by DCJs alone. There is still room for research in this area, most notably the problem of sorting when the cost of indels is allowed to vary with respect to the cost of a DCJ and we demand a minimum cost transformation of one genome into another.

Highlights

  • Preliminaries: Say that we are given a perfect matching on 2N labeled vertices V, forming a set G of N edges called genes; the vertices of each gene form its head and tail.

  • In light of Theorem 3, we have reduced DCJ-indel sorting to the problem of constructing indels intelligently to maximize a weighted sum of breakpoint graph components.

  • We still do not see a natural correspondence between the two approaches to DCJ-indel sorting, which appear to be at odds because their definitions of indels are equivalent but motivated differently.


Summary

Introduction

Preliminaries: Say that we are given a perfect matching on 2N labeled vertices V, forming a set G of N edges called genes; the vertices of each gene form its head and tail. Little of this work has incorporated indels (i.e., insertions and deletions of chromosomes and chromosomal intervals) into the calculation of genomic distance functions, with the exception of Braga et al., who provided a linear time algorithm for the problem of DCJ-indel sorting. More recent research has moved past permutations and toward multichromosomal genomic models that incorporate both linear and circular chromosomes. One of these models, which we will study in this paper, represents the chromosomes of a genome as paths and cycles in a graph. For this model, the double cut and join operation (DCJ) was introduced in [8] and incorporates segment reversals with a number of other operations. A linear time greedy algorithm exists for DCJ sorting two genomes having equal gene content (see [9]).
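The graph model and the DCJ described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the names `ends`, `gene_of`, `dcj`, `chromosomes`, and `delete_chromosome`, and the encoding of gene extremities as strings such as `a_t` and `a_h`, are assumptions made here for illustration only. Under those assumptions, a genome is a set of adjacencies (unordered pairs of extremities), chromosomes are the connected components obtained by following genes and adjacencies, and a DCJ cuts two adjacencies and rejoins the four free extremities. The example shows one DCJ excising the interval b, c from a circular chromosome as a new circular chromosome, so that deleting the interval amounts to deleting that chromosome.

```python
def ends(gene):
    """Tail and head extremities of a gene (hypothetical string encoding)."""
    return (gene + "_t", gene + "_h")


def gene_of(extremity):
    """Recover the gene name from an extremity such as 'a_h'."""
    return extremity.rsplit("_", 1)[0]


def dcj(adjacencies, adj1, adj2, crossed=False):
    """Cut adj1=(p, q) and adj2=(r, s), then rejoin the free extremities.

    Rejoins as {p, r}, {q, s}, or as {p, s}, {q, r} when crossed is True.
    """
    p, q = adj1
    r, s = adj2
    cut = {frozenset(adj1), frozenset(adj2)}
    if not cut <= adjacencies:
        raise ValueError("both adjacencies must be present in the genome")
    if crossed:
        joined = {frozenset((p, s)), frozenset((q, r))}
    else:
        joined = {frozenset((p, r)), frozenset((q, s))}
    return (adjacencies - cut) | joined


def chromosomes(genes, adjacencies):
    """Group genes into chromosomes (connected components of the graph)."""
    partner = {}
    for adj in adjacencies:
        x, y = tuple(adj)
        partner[x], partner[y] = y, x
    seen, result = set(), []
    for start in genes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            for extremity in ends(gene):
                if extremity in partner:
                    stack.append(gene_of(partner[extremity]))
        seen |= component
        result.append(frozenset(component))
    return result


def delete_chromosome(genes, adjacencies, component):
    """Remove every gene of one chromosome together with its adjacencies."""
    kept_genes = [g for g in genes if g not in component]
    kept_adjacencies = {
        adj for adj in adjacencies
        if not any(gene_of(e) in component for e in adj)
    }
    return kept_genes, kept_adjacencies


# One circular chromosome with genes a, b, c, d in the same orientation.
genes = ["a", "b", "c", "d"]
circular = {
    frozenset(("a_h", "b_t")),
    frozenset(("b_h", "c_t")),
    frozenset(("c_h", "d_t")),
    frozenset(("d_h", "a_t")),
}

# A single DCJ excises the interval b, c as its own circular chromosome.
excised = dcj(circular, ("a_h", "b_t"), ("c_h", "d_t"), crossed=True)

# Deleting that circular chromosome completes the interval deletion.
remaining_genes, remaining = delete_chromosome(
    genes, excised, frozenset({"b", "c"})
)

# The other rejoining of the same two cuts reverses b, c in place instead.
reversal = dcj(circular, ("a_h", "b_t"), ("c_h", "d_t"), crossed=False)
```

In this sketch the two ways of rejoining the same pair of cuts give either an excision or a segment reversal, which is one way to see how a single operation can cover both.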

