Abstract

Background: Classical approaches to compute the genomic distance are usually limited to genomes with the same content and take into consideration only rearrangements that change the organization of the genome (i.e. positions and orientation of pieces of DNA, number and type of chromosomes, etc.), such as inversions, translocations, fusions and fissions. These operations are generically represented by the double-cut and join (DCJ) operation. The distance between two genomes, in terms of number of DCJ operations, can be computed in linear time. In order to handle genomes with distinct contents, insertions and deletions of fragments of DNA, named indels, must also be allowed. More powerful than an indel is a substitution of a fragment of DNA by another fragment of DNA. Indels and substitutions are called content-modifying operations. It has been shown that both the DCJ-indel and the DCJ-substitution distances can also be computed in linear time, assuming that the same cost is assigned to any DCJ or content-modifying operation.

Results: In the present study we extend the DCJ-indel and the DCJ-substitution models, considering that the content-modifying cost is distinct from and upper bounded by the DCJ cost, and show that the distance in both models can still be computed in linear time. Although the triangle inequality can be violated in both models, we also show how to efficiently fix this problem a posteriori.

Highlights

  • Classical approaches to compute the genomic distance are usually limited to genomes with the same content and take into consideration only rearrangements that change the organization of the genome, such as inversions, translocations, fusions and fissions

  • We refine the double-cut and join (DCJ)-indel [7] and the DCJ-substitution [8] models, by adopting a distinct content-modifying cost that is upper bounded by the DCJ cost

  • We show how to compute the DCJ-indel and the DCJ-substitution distances, considering that the content-modifying cost is distinct from and upper bounded by the DCJ cost


Summary

Results

We show how to compute the DCJ-indel and the DCJ-substitution distances, considering that the content-modifying cost is distinct from and upper bounded by the DCJ cost. Let $d^{sb}_{DCJ}(P)$ be the DCJ-substitution distance of $P$, that is, the minimum cost of a sequence of DCJ and substitution operations sorting $P$ separately. This is given by the following proposition. In the case of the DCJ-substitution distance, for genomes $A$ and $B$ and a positive constant $k$, let $m^{sb}(A,B) = d^{sb}_{DCJ}(A,B) + k \cdot u(A,B)$, where $u(A,B)$ is the number of unique markers between $A$ and $B$ [7,12]. To find the minimum value of $k$ for which the inequality of Proposition 6 holds, we need to determine the diameter of the DCJ-substitution distance, which is given by the following lemma.
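The corrected measure $m^{sb}(A,B)$ defined above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a genome is stored as a list of chromosomes, each a list of signed integers, and it reads "unique markers" as markers present in exactly one of the two genomes. The function names `unique_markers` and `corrected_measure` are hypothetical, and the DCJ-substitution distance itself is taken as a precomputed input rather than derived here.

```python
def unique_markers(genome_a, genome_b):
    """Count markers that occur in exactly one of the two genomes.

    Assumption: a genome is a list of chromosomes, each a list of signed
    integers; the sign encodes orientation and is ignored when counting.
    """
    markers_a = {abs(m) for chromosome in genome_a for m in chromosome}
    markers_b = {abs(m) for chromosome in genome_b for m in chromosome}
    # Symmetric difference: markers found in A only or in B only.
    return len(markers_a ^ markers_b)


def corrected_measure(distance, genome_a, genome_b, k):
    """Return distance + k * u(A, B) for a positive constant k.

    `distance` stands for an already computed DCJ-substitution distance;
    this sketch does not compute it.
    """
    if k <= 0:
        raise ValueError("k must be a positive constant")
    return distance + k * unique_markers(genome_a, genome_b)


# Toy genomes: markers 2 and 4 occur only in A, marker 5 only in B.
genome_a = [[1, -2, 3, 4]]
genome_b = [[1, 3], [5]]
u = unique_markers(genome_a, genome_b)
m = corrected_measure(2.0, genome_a, genome_b, k=1.5)
```

The distance value `2.0` and the constant `k=1.5` are placeholders chosen for the example; the minimum admissible `k` depends on the diameter result mentioned in the text.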
