Abstract

Comparative genomics is a growing field in computational biology, and one of its typical problems is the identification of sets of orthologous genes that have virtually the same function in several genomes. Many different bioinformatics approaches have been proposed to define these groups, often based on the detection of sets of genes that are “not too far” in all genomes. In this paper, we propose a unifying concept, called gene teams, which can be adapted to various notions of distance. We present two algorithms for identifying gene teams formed by n genes placed on m linear chromosomes. The first one runs in O(mn log² n) time and uses a divide-and-conquer approach based on the formal properties of gene teams. We next propose an optimization of the original algorithm, and, in order to better understand the complexity bound of the algorithms, we recast the problem in Hopcroft's partition refinement framework. This allows us to analyze the complexity of the algorithms with elegant amortized techniques. Both algorithms require linear space. We also discuss extensions to circular chromosomes that achieve the same complexity.
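To make the notion of genes that are “not too far” in all genomes concrete, the following minimal Python sketch may help. It rests on assumptions that the abstract does not spell out: each gene occurs exactly once on every chromosome, distance is the absolute difference of positions, and a team is a maximal set of genes whose consecutive members lie within a threshold `delta` of each other on every chromosome. The function name `gene_teams`, the position dictionaries, and `delta` are hypothetical names chosen for illustration. This is a naive refinement that repeatedly splits candidate sets at large gaps; it is not the O(mn log² n) algorithm or its optimized variant.

```python
from typing import Dict, List, Set


def gene_teams(positions: List[Dict[str, float]], delta: float) -> List[Set[str]]:
    """Naive sketch: split candidate sets at gaps larger than delta.

    positions[i] maps each gene name to its position on chromosome i
    (assumption: every gene appears exactly once on every chromosome).
    """
    if not positions or not positions[0]:
        return []

    stack = [set(positions[0])]  # start with all genes as one candidate
    teams: List[Set[str]] = []

    while stack:
        candidate = stack.pop()
        for chrom in positions:
            # Order the candidate's genes by position on this chromosome.
            ordered = sorted(candidate, key=chrom.__getitem__)
            parts = [[ordered[0]]]
            for prev, cur in zip(ordered, ordered[1:]):
                if chrom[cur] - chrom[prev] > delta:
                    parts.append([cur])      # gap too large: start a new part
                else:
                    parts[-1].append(cur)
            if len(parts) > 1:
                # The candidate is not a team; refine each part separately.
                stack.extend(set(p) for p in parts)
                break
        else:
            # No chromosome splits the candidate, so it is kept as a team.
            teams.append(candidate)
    return teams


# Hypothetical positions of five genes on two chromosomes.
chrom1 = {"a": 1, "b": 2, "c": 3, "d": 8, "e": 9}
chrom2 = {"a": 2, "b": 1, "c": 7, "d": 8, "e": 9}
print(sorted(sorted(team) for team in gene_teams([chrom1, chrom2], delta=2)))
# prints [['a', 'b'], ['c'], ['d', 'e']]
```

In this toy input, gene c is close to a and b on the first chromosome and close to d and e on the second, so under the assumed definition it ends up in a team of its own.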
