Abstract

Given two genomes, the problem of sorting by reversals is to explain the evolution of these genomes from a common ancestor by a minimal sequence of reversals. The Hannenhalli and Pevzner (HP) algorithm [8] gives the reversal distance and outputs one possible sequence of reversals. However, there is usually a very large set of such minimal solutions. To really understand the mechanism of reversals, it is important to have access to that set of minimal solutions. We develop a new method that allows the user to choose one or several solutions, based on different criteria. In particular, it can be used to sort genomes by weighted reversals. This requires a characterization of all reversals, as defined in the HP theory. We describe a procedure that outputs the set of all safe reversals at each step of the sorting procedure in time O(n^3), and we show how to characterize a large set of such reversals in a more efficient way. We also describe a linear-time algorithm for generating a random genome at a given reversal distance. We use our methods to test the hypothesis that, in bacteria, most reversals act on segments surrounding one of the two endpoints of the replication axis [12].
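
As an illustration of the basic operation only, the following minimal sketch applies a signed reversal to a genome modeled as a signed permutation (a list of nonzero integers, where the sign encodes gene orientation). The function names `reverse_segment` and `apply_reversals` are hypothetical, and the sketch is not the HP algorithm and does not detect safe reversals; it only shows what a single reversal does to a gene order.

```python
from typing import Iterable, List, Tuple


def reverse_segment(perm: List[int], i: int, j: int) -> List[int]:
    """Return a copy of perm with the segment perm[i..j] reversed.

    Indices are 0-based and inclusive. A signed reversal both reverses
    the order of the genes in the segment and flips each gene's sign.
    """
    if not 0 <= i <= j < len(perm):
        raise ValueError("need 0 <= i <= j < len(perm)")
    flipped = [-gene for gene in reversed(perm[i:j + 1])]
    return perm[:i] + flipped + perm[j + 1:]


def apply_reversals(perm: List[int],
                    reversals: Iterable[Tuple[int, int]]) -> List[int]:
    """Apply a sequence of (i, j) reversals from left to right."""
    for i, j in reversals:
        perm = reverse_segment(perm, i, j)
    return perm


# Hypothetical example genome: one reversal sorts it to the identity.
genome = [1, -3, -2, 4]
print(reverse_segment(genome, 1, 2))  # [1, 2, 3, 4]
```

A sequence of reversals that sorts a genome is a candidate solution; the reversal distance is the length of a shortest such sequence, and the abstract's point is that many different shortest sequences usually exist.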
