A sequence-based evolutionary distance method for Phylogenetic analysis of highly divergent proteins

Wei Cao, Lu-Yun Wu, Xia-Yu Xia, Xiang Chen, Zhi-Xin Wang, Xian-Ming Pan

doi:10.1038/s41598-023-47496-9

Open Access

https://doi.org/10.1038/s41598-023-47496-9

Journal: Scientific Reports
Publication Date: Nov 20, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Ministerio de Educación del Perú

Abstract

Because of the limited effectiveness of prevailing phylogenetic methods when applied to highly divergent protein sequences, the phylogenetic analysis problem remains challenging. Here, we propose a sequence-based evolutionary distance algorithm termed sequence distance (SD), which innovatively incorporates site-to-site correlation within protein sequences into the distance estimation. In protein superfamilies, SD can effectively distinguish evolutionary relationships both within and between protein families, producing phylogenetic trees that closely align with those based on structural information, even with sequence identity less than 20%. SD is highly correlated with the similarity of the protein structure, and can calculate evolutionary distances for thousands of protein pairs within seconds using a single CPU, which is significantly faster than most protein structure prediction methods that demand high computational resources and long run times. The development of SD will significantly advance phylogenetics, providing researchers with a more accurate and reliable tool for exploring evolutionary relationships.
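
For orientation, the "sequence identity less than 20%" threshold mentioned in the abstract can be illustrated with a minimal sketch. This is not the SD algorithm, whose details are not given here; it is the conventional identity-based baseline (the p-distance) that treats every aligned site independently. The function names, the gap character, and the choice to count only columns where neither sequence has a gap are assumptions made for illustration; other denominators are also in use and give different percentages.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Assumption: only columns where neither sequence has a gap are counted.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites (1 - identity); a naive baseline, not SD."""
    return 1.0 - percent_identity(a, b) / 100.0


def distance_matrix(seqs: Dict[str, str]) -> Tuple[List[str], List[List[float]]]:
    """Symmetric pairwise p-distance matrix for a set of aligned sequences."""
    names = list(seqs)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = p_distance(seqs[names[i]], seqs[names[j]])
            matrix[i][j] = d
            matrix[j][i] = d
    return names, matrix


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = {"s1": "ACDEFG", "s2": "ACDEFA", "s3": "AC-EKA"}
    names, m = distance_matrix(toy)
    for name, row in zip(names, m):
        print(name, [round(v, 3) for v in row])
```

A matrix like this is the usual input to distance-based tree building. Because it scores each site independently, it carries little signal once identity falls to the range the abstract describes, which is the gap SD is proposed to address by incorporating site-to-site correlation.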

Similar Papers

Percent Sequence Identity: The Need to Be Explicit
Alex C.W May
Structure | VOL. 12
01 May 2004

Heuristic Methods for Finding Pathogenic Variants in Gene Coding Sequences
Monique Ohanian ... Diane Fatkin
Journal of the American Heart Association | VOL. 1
26 Sep 2012

Protein structure alignment considering phenotypic plasticity
Gergely Csaba ... Fabian Birzele
Bioinformatics | VOL. 24
09 Aug 2008

Classification tree based protein structure distances for testing sequence–structure correlation
Elias Zintzaras
Computers in Biology and Medicine | VOL. 38
04 Mar 2008
