Spectral analysis of phylogenetic data

Michael D Hendy, David Penny

doi:10.1007/bf02638451

Abstract

The spectral analysis of sequence and distance data is a new approach to phylogenetic analysis. For two-state character sequences, the character values at a given site split the set of taxa into two subsets, a bipartition of the taxa set. The vector which counts the relative numbers of each of these bipartitions over all sites is called a sequence spectrum. Applying a transformation called a Hadamard conjugation, the sequence spectrum is transformed to the conjugate spectrum. This conjugation corrects for unobserved changes in the data, independently of the choice of phylogenetic tree. For any given phylogenetic tree with edge weights (probabilities of state change), we define a corresponding tree spectrum. The selection of a weighted phylogenetic tree from the given sequence data is made by matching the conjugate spectrum with a tree spectrum. We develop an optimality selection procedure using a least squares best fit, to find the phylogenetic tree whose tree spectrum most closely matches the conjugate spectrum. An inferred sequence spectrum can be derived from the selected tree spectrum using the inverse Hadamard conjugation to allow a comparison with the original sequence spectrum. A possible adaptation for the analysis of four-state character sequences with unequal frequencies is considered. A corresponding spectral analysis for distance data is also introduced. These analyses are illustrated with biological examples for both distance and sequence data. Spectral analysis using the fast Hadamard transform allows optimal trees to be found for at least 20 taxa and perhaps for up to 30 taxa. The development presented here is self-contained, although some mathematical proofs available elsewhere have been omitted. The analysis of sequence data is based on methods reported earlier, but the terminology and the application to distance data are new.
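
The pipeline described in the abstract (sequence spectrum, Hadamard conjugation, inverse conjugation) can be illustrated with a minimal Python sketch. The abstract does not give formulas, so this sketch assumes the commonly cited form of the conjugation, q = H^{-1} ln(H s), with inverse s = H^{-1} exp(H q), where H is the Sylvester-ordered Hadamard matrix of size 2^(n-1) for n taxa. The function names, the choice of the last taxon as the reference for indexing bipartitions, and the toy sequences are all hypothetical and used only for illustration.

```python
import math

def split_index(column):
    """Encode one two-state site pattern as a bipartition index.

    Assumption: the last taxon is the reference, and bit i is set
    when taxon i differs from it. Index 0 is the constant pattern.
    """
    ref = column[-1]
    return sum(1 << i for i, state in enumerate(column[:-1]) if state != ref)

def sequence_spectrum(sequences):
    """Relative frequency of each bipartition over all aligned sites."""
    n_taxa = len(sequences)
    n_sites = len(sequences[0])
    spectrum = [0.0] * (1 << (n_taxa - 1))
    for column in zip(*sequences):
        spectrum[split_index(column)] += 1.0 / n_sites
    return spectrum

def fwht(vec):
    """Unnormalised fast Walsh-Hadamard transform (Sylvester ordering)."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    return out

def hadamard_conjugate(s):
    """Conjugate spectrum q = H^-1 ln(H s); needs every entry of H s > 0."""
    rho = [math.log(r) for r in fwht(s)]
    return [x / len(s) for x in fwht(rho)]

def inverse_conjugate(q):
    """Inferred sequence spectrum s = H^-1 exp(H q)."""
    r = [math.exp(x) for x in fwht(q)]
    return [x / len(q) for x in fwht(r)]

# Toy alignment: three taxa, five two-state sites (hypothetical data).
toy = ["00001", "00010", "00000"]
s = sequence_spectrum(toy)
q = hadamard_conjugate(s)
s_back = inverse_conjugate(q)
```

In this sketch the fast transform costs on the order of m log m operations for a spectrum of length m = 2^(n-1), rather than the m^2 of an explicit matrix product. The logarithm fails when any entry of H s is zero or negative, which can occur with highly divergent data; how to handle that case is left open here.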

Journal: Journal of Classification
Publication Date: Jan 1, 1993
Citations: 189

