Abstract

Background: The shape of phylogenetic trees has been used to make inferences about the evolutionary process by comparing the shapes of actual phylogenies with those expected under simple models of the speciation process. Previous studies have focused on speciation events, but gene duplication is another lineage splitting event, analogous to speciation, and gene loss or deletion is analogous to extinction. Measures of the shape of gene family phylogenies can thus be used to investigate the processes of gene duplication and loss. We make the first systematic attempt to use tree shape to study gene duplication using human gene phylogenies.

Results: We find that gene duplication has produced gene family trees significantly less balanced than expected from a simple model of the process, and less balanced than species phylogenies: the opposite of what might be expected under the 2R hypothesis.

Conclusion: While other explanations are plausible, we suggest that the greater imbalance of gene family trees than species trees is due to the prevalence of tandem duplications over regional duplications during the evolution of the human genome.
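The comparison described above, between the shape of an observed tree and the shape expected under a simple splitting model, can be sketched as a Monte Carlo test. The example below is an illustration under stated assumptions, not the study's procedure: it uses Colless' imbalance index as a stand-in shape statistic, it simulates an equal-rates model in which every lineage is equally likely to split, and all function names are hypothetical. Trees are written as nested 2-tuples, with any non-tuple value treated as a leaf.

```python
import random


def colless(tree):
    """Return (leaf_count, colless_index) for a binary tree of nested 2-tuples.

    Colless' index sums |left leaves - right leaves| over all internal nodes.
    It is an assumed stand-in statistic for this sketch.
    """
    if not isinstance(tree, tuple):
        return 1, 0
    left, right = tree
    n_left, c_left = colless(left)
    n_right, c_right = colless(right)
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)


def simulate_erm_colless(n_leaves, rng):
    """Colless index of one random equal-rates tree with n_leaves leaves.

    Relies on the property that, under this model, the number of leaves on
    one side of a node with n descendants is uniform on 1..n-1.
    """
    total = 0
    stack = [n_leaves]
    while stack:
        n = stack.pop()
        if n < 2:
            continue
        left = rng.randint(1, n - 1)
        total += abs(n - 2 * left)
        stack.extend((left, n - left))
    return total


def erm_imbalance_p_value(tree, n_sim=2000, seed=0):
    """One-sided Monte Carlo p-value: is the tree at least as imbalanced
    as trees simulated under the equal-rates model?"""
    rng = random.Random(seed)
    n_leaves, observed = colless(tree)
    hits = sum(simulate_erm_colless(n_leaves, rng) >= observed
               for _ in range(n_sim))
    return (hits + 1) / (n_sim + 1)


# Hypothetical 8-leaf trees: a fully unbalanced "caterpillar" and a
# fully balanced tree.
caterpillar = "g0"
for i in range(1, 8):
    caterpillar = (caterpillar, f"g{i}")
balanced = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))

print(colless(caterpillar), erm_imbalance_p_value(caterpillar))
print(colless(balanced), erm_imbalance_p_value(balanced))
```

A small p-value for a tree suggests it is more imbalanced than the equal-rates model would typically produce. Per-tree p-values of this kind are what a combined test across many gene families would take as input.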

Highlights

  • The shape of phylogenetic trees has been used to make inferences about the evolutionary process by comparing the shapes of actual phylogenies with those expected under simple models of the speciation process

  • Gene family trees are more unbalanced than expected under the Equal-Rate Markov (ERM) model but substantially more balanced than expected under the proportional-to-distinguishable arrangements (PDA) model

  • This can be confirmed for the ERM model because the individual pIm scores can be combined using Fisher's method to yield an overall p-value for the hypothesis that the trees were drawn from an ERM distribution [27, pp. 794–797]; this hypothesis is significantly rejected for our data (χ² = 1693.9, df = 1430, P < 0.0001)
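Fisher's method, mentioned in the last highlight, combines k independent p-values into the statistic X² = −2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom when every null hypothesis holds. The sketch below shows that calculation using only the Python standard library. The function names and the example p-values are hypothetical, and the closed-form tail probability used here applies only to even degrees of freedom (always the case for Fisher's method). It is an illustration, not the authors' code.

```python
import math


def fisher_statistic(p_values):
    """Return (X2, df) for Fisher's method: X2 = -2 * sum(ln p_i), df = 2k."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return stat, 2 * len(p_values)


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    For df = 2k the tail is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    Terms are summed in log space so that large df values do not underflow.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    log_terms = [-half + i * math.log(half) - math.lgamma(i + 1)
                 for i in range(df // 2)]
    peak = max(log_terms)
    total = math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)
    return min(1.0, total)


def fisher_combined_p(p_values):
    """Overall p-value from Fisher's method."""
    stat, df = fisher_statistic(p_values)
    return chi2_sf_even_df(stat, df)


# Hypothetical per-tree p-values, for illustration only.
example = [0.04, 0.20, 0.61, 0.03, 0.15]
print(fisher_statistic(example), fisher_combined_p(example))

# Tail probability for the statistic quoted in the text.
print(chi2_sf_even_df(1693.9, 1430))
```

Because df = 2k, the reported df = 1430 would correspond to 715 combined p-values under this method. Evaluating the tail function at the reported statistic gives a probability below 0.0001, consistent with the value quoted above.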


Summary

Introduction

The shape of phylogenetic trees has been used to make inferences about the evolutionary process by comparing the shapes of actual phylogenies with those expected under simple models of the speciation process. Molecular phylogenies for gene families (e.g. figure 1a) usually display sequences for different orthologous groups of proteins [1] from one or more species. These trees can show a complicated tapestry of orthology and paralogy, and nodes on such trees may represent either gene duplications or speciations (figure 1, [2]): both are splitting events, producing daughter lineages that have independent evolutionary histories (at least in the absence of gene conversion or introgression [3]). Gene sequences from a completely sequenced ...

