Abstract

Genome Rearrangement (GR) is a field of computational biology that uses conserved regions within two genomes as a source of information for comparison. It uses the order in which these regions appear to infer evolutionary scenarios and to compute distances between species, while usually neglecting non-conserved DNA sequences. This paper addresses that gap and proposes models that use both conserved and non-conserved sequences as a source of information. The questions that arise are how classic GR algorithms should be adapted and how much we would pay in terms of complexity to have this feature. Advances on these questions help measure the advantages of including such an approach in GR algorithms. We propose to represent non-conserved regions by their lengths and apply this idea to a genome rearrangement problem called “Sorting by Block-Interchanges”. The problem is an interesting choice from the theory-of-computation viewpoint because it is one of the few GR problems that are solvable in polynomial time and whose algorithm has a small number of steps. We present a 2-approximation algorithm for this problem, along with data structures and formal definitions that may be generalized to other problems in the GR field that consider intergenic regions.
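To make the idea of representing non-conserved regions by their lengths concrete, the following is a minimal sketch, not the paper's data structure or its 2-approximation algorithm. It assumes a genome with n conserved regions is stored as a list `genes` plus a list `ir` of n + 1 intergenic lengths, where `ir[t]` lies immediately before `genes[t]` and `ir[n]` follows the last gene. The function name `block_interchange`, the `cuts` and `offsets` parameters, and the restriction to four distinct cut regions are all assumptions made for illustration; the paper's formal definition may differ.

```python
from typing import List, Sequence, Tuple


def block_interchange(
    genes: Sequence[int],
    ir: Sequence[int],
    cuts: Tuple[int, int, int, int],
    offsets: Tuple[int, int, int, int],
) -> Tuple[List[int], List[int]]:
    """Swap blocks genes[a:b] and genes[c:d], cutting four intergenic regions.

    Hypothetical model: region ir[t] is cut after its first offsets[...] 
    nucleotides; the pieces are rejoined at the new block boundaries, so the
    total intergenic length is preserved.
    """
    n = len(genes)
    if len(ir) != n + 1:
        raise ValueError("expected len(ir) == len(genes) + 1")

    a, b, c, d = cuts
    # Simplifying assumption: four distinct regions, so the middle block
    # genes[b:c] is non-empty (adjacent blocks are not handled here).
    if not (0 <= a < b < c < d <= n):
        raise ValueError("cuts must satisfy 0 <= a < b < c < d <= n")

    xa, xb, xc, xd = offsets
    for region, x in zip(cuts, offsets):
        if not (0 <= x <= ir[region]):
            raise ValueError("offset must lie inside the region it cuts")

    # New gene order: prefix, second block, middle, first block, suffix.
    new_genes = (
        list(genes[:a]) + list(genes[c:d]) + list(genes[b:c])
        + list(genes[a:b]) + list(genes[d:])
    )

    # Each junction joins the left piece of one cut region to the right
    # piece of another; regions inside a block travel with that block.
    new_ir = (
        list(ir[:a])
        + [xa + (ir[c] - xc)] + list(ir[c + 1:d])
        + [xd + (ir[b] - xb)] + list(ir[b + 1:c])
        + [xc + (ir[a] - xa)] + list(ir[a + 1:b])
        + [xb + (ir[d] - xd)]
        + list(ir[d + 1:])
    )
    return new_genes, new_ir


# Example: one operation sorts the gene order and redistributes lengths.
sorted_genes, new_lengths = block_interchange(
    [4, 2, 3, 1], [1, 2, 3, 4, 5], cuts=(0, 1, 3, 4), offsets=(1, 1, 2, 3)
)
print(sorted_genes, new_lengths)  # [1, 2, 3, 4] [3, 4, 3, 2, 3]
```

In this sketch a single operation changes both the gene order and up to four intergenic lengths while keeping their sum fixed, which illustrates why sorting must match the target lengths as well as the target order.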

