An algorithmic view of gene teams

Marie-Pierre Béal,Anne Bergeron,Sylvie Corteel,Mathieu Raffinot

doi:10.1016/j.tcs.2004.02.036

Abstract

Comparative genomics is a growing field in computational biology, and one of its typical problem is the identification of sets of orthologous genes that have virtually the same function in several genomes. Many different bioinformatics approaches have been proposed to define these groups, often based on the detection of sets of genes that are “not too far” in all genomes. In this paper, we propose a unifying concept, called gene teams, which can be adapted to various notions of distance. We present two algorithms for identifying gene teams formed by n genes placed on m linear chromosomes. The first one runs in O(mn log 2 n) and uses a divide and conquer approach based on the formal properties of gene teams. We next propose an optimization of the original algorithm, and, in order to better understand the complexity bound of the algorithms, we recast the problem in the Hopcroft's partition refinement framework. This allows us to analyze the complexity of the algorithms with elegant amortized techniques. Both algorithms require linear space. We also discuss extensions to circular chromosomes that achieve the same complexity.

Full Text