Leveraging genomic redundancy to improve inference and alignment of orthologous proteins.

Marc Singleton, Michael Eisen

doi:10.1093/g3journal/jkad222

Abstract

Identifying protein sequences with common ancestry is a core task in bioinformatics and evolutionary biology. However, methods for inferring and aligning such sequences in annotated genomes have not kept pace with the increasing scale and complexity of the available data. In this work, we therefore implemented several improvements to the traditional methodology that more fully leverage the redundancy of closely related genomes and the organization of their annotations. Two highlights are the application of the more flexible k-clique percolation algorithm for identifying clusters of orthologous proteins and the development of a novel technique for removing poorly supported regions of alignments with a phylogenetic hidden Markov model (phylo-HMM). To build the latter, we wrote a fully documented Python package, Homomorph, that implements standard HMM algorithms, and created a set of tutorials to promote its use by a wide audience. We applied the resulting pipeline to a set of 33 annotated Drosophila genomes, generating 22,813 orthologous groups and 8,566 high-quality alignments.
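
To make the clustering idea concrete, here is a minimal sketch of k-clique percolation in plain Python. It is not the authors' pipeline: the protein names, the edge list, and the function name `k_clique_communities` are hypothetical, and the edges simply stand in for whatever pairwise similarity relation links proteins in the graph. Two k-cliques are merged when they share k-1 nodes, and each community is the union of the nodes in a connected set of cliques.

```python
from itertools import combinations


def k_clique_communities(edges, k=3):
    """Return communities found by k-clique percolation (brute-force sketch)."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-clique: a set of k nodes that are all pairwise adjacent.
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques; two cliques are joined if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of nodes over each connected group of cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


# Hypothetical similarity graph over proteins A..G (illustrative data only).
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),  # triangle A-B-C
    ("B", "D"), ("C", "D"),              # triangle B-C-D, shares edge B-C
    ("D", "E"),                          # E is in no triangle
    ("D", "F"), ("D", "G"), ("F", "G"),  # triangle D-F-G, shares only node D
]
communities = k_clique_communities(edges, k=3)
for community in communities:
    print(sorted(community))
```

In this toy graph, protein D belongs to two communities and protein E to none, which illustrates how the method allows overlapping clusters and leaves weakly connected nodes unassigned. The brute-force clique enumeration is only practical for small graphs; a real analysis would need a more efficient clique finder.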

Journal: G3 (Bethesda, Md.)
Publication Date: Sep 28, 2023
License type: CC BY 4.0

