On the family-free DCJ distance and similarity.

Fábio V Martinez, Jens Stoye, Marília D V Braga, Pedro Feijão

doi:10.1186/s13015-015-0041-9

Abstract

Structural variation in genomes can be revealed by many (dis)similarity measures. Rearrangement operations, such as the so-called double-cut-and-join (DCJ), are large-scale mutations that can create complex changes and produce such variations in genomes. A basic task in comparative genomics is to find the rearrangement distance between two given genomes, i.e., the minimum number of rearrangement operations that transform one given genome into another one. In a family-based setting, genes are grouped into gene families, and efficient algorithms have already been presented to compute the DCJ distance between two given genomes. In this work we propose the problem of computing the DCJ distance between two given genomes without prior gene family assignment, directly using the pairwise similarities between genes. We prove that this new family-free DCJ distance problem is APX-hard and provide an integer linear program for its solution. We also study a family-free DCJ similarity and prove that its computation is NP-hard.

Highlights

Genomes are subject to mutations or rearrangements in the course of evolution
To be more consistent with the comparative genomics literature, where distance measures are more common than similarities, here we propose a family-free DCJ distance
We propose an integer linear program (ILP) formulation to compute the family-free DCJ distance between two given genomes

Summary

Introduction

Genomes are subject to mutations or rearrangements in the course of evolution. Typical large-scale rearrangements change the number of chromosomes and/or the positions and orientations of genes. Examples of such rearrangements are inversions, translocations, fusions and fissions. A classical problem in comparative genomics is to compute the rearrangement distance, that is, the minimum number of rearrangements required to transform a given genome into another given genome [1]. In order to study this problem, one usually adopts a high-level view of genomes, in which only “relevant” fragments of the DNA (e.g., genes) are taken into consideration. A pre-processing of the data is required, so that we can compare the content of the genomes.
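
To make the notion of rearrangement distance concrete, the following minimal sketch computes the DCJ distance in the simplest family-based setting, which is not the family-free problem studied in this paper. It assumes both genomes consist only of circular chromosomes, contain exactly the same genes, and have no duplicates; under those assumptions the distance is n - c, where n is the number of genes and c is the number of cycles formed by alternating between the adjacencies of the two genomes. The signed-integer genome encoding and the function names (`extremities`, `adjacencies`, `dcj_distance_circular`) are illustrative choices, not taken from the paper.

```python
def extremities(gene):
    """Return the (left, right) extremities of a signed gene as it is read.

    A gene +g is read tail-to-head, a gene -g head-to-tail.
    """
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacencies(genome):
    """Map every gene extremity to its neighbour in a genome.

    `genome` is a list of circular chromosomes, each a list of signed,
    non-zero integers (an assumed encoding for this sketch).
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular
            right = extremities(gene)[1]
            left = extremities(nxt)[0]
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance_circular(genome_a, genome_b):
    """DCJ distance n - c for circular genomes with identical gene content."""
    pa, pb = adjacencies(genome_a), adjacencies(genome_b)
    if set(pa) != set(pb):
        raise ValueError("genomes must contain exactly the same genes")
    n = len(pa) // 2  # two extremities per gene
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Follow one adjacency of A, then one of B, until the cycle closes.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return n - cycles


# A single inversion of gene 2 separates these two circular genomes.
print(dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]]))
```

In the family-free setting the gene correspondence is not given in advance, so this direct cycle count no longer applies; the sketch only illustrates what the family-based distance measures.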

Journal: Algorithms for Molecular Biology	Publication Date: Apr 1, 2015
Citations: 37	License type: CC BY 2.0
