Abstract

Background: Accurately estimating the timing and mode of gene duplications along the evolutionary history of species can provide invaluable information about the underlying mechanisms by which the genomes of organisms evolved and genes with novel functions arose. Mechanistic models have previously been introduced that allow for probabilistic inference of the evolutionary mechanism for duplicate gene retention based upon the average rate of loss over time of the duplicate. However, there is currently no probabilistic model embedded in a birth-death modeling framework that can take into account the effects of different evolutionary mechanisms of gene retention when analyzing gene family data.

Results: In this study, we describe a generalized birth-death process for modeling the fates of gene duplication. Use of mechanistic models in a phylogenetic framework requires an age-dependent birth-death process. Starting with a single population corresponding to the lineage of a phylogenetic tree and with an assumption of a clock that starts ticking for each duplicate at its birth, an age-dependent birth-death process is developed by extending the results from the time-dependent birth-death process. The implementation of such models in a full phylogenetic framework is expected to enable large-scale probabilistic analysis of duplicates in comparative genomic studies.

Conclusions: We develop an age-dependent birth-death model for understanding the mechanisms of gene retention, which allows a gene loss rate dependent on each duplication event. Simulation results indicate that different mechanisms of gene retention produce distinct likelihood functions, which can be used with genomic data to quantitatively distinguish those mechanisms.

Highlights

  • Estimating the timing and mode of gene duplications along the evolutionary history of species can provide invaluable information about underlying mechanisms by which the genomes of organisms evolved and the genes with novel functions arose

  • There is no probabilistic model embedded in a birth-death phylogenetic modeling framework that can take into account the effects of different evolutionary mechanisms of gene retention when analyzing gene family data

  • The process of gene duplication and loss occurs under the following postulates [54]: (1) the probability that a duplication will occur during an infinitesimal interval (t, t + Δt] is n_t λ Δt + o(Δt), while the probability that no duplication will occur is 1 − n_t λ Δt + o(Δt), and (2) the probability that a gene duplicate will be lost during an infinitesimal interval (t, t + Δt] is n_t μ_t Δt + o(Δt), while the probability that no loss will occur is 1 − n_t μ_t Δt + o(Δt), in which the loss rate μ_t is a function of time t
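
The two postulates in the last highlight can be illustrated with a small discrete-time simulation. The sketch below is not the authors' implementation: it assumes a fixed small step dt in place of the infinitesimal interval, treats the gene count at time t as the quantity multiplying the rates, and uses hypothetical names (simulate_gene_count, decaying_loss_rate) and hypothetical parameter values. In each step, every gene copy duplicates with probability λ·dt or is lost with probability μ(t)·dt, so the totals over all copies match the postulated probabilities up to o(dt).

```python
import math
import random


def simulate_gene_count(n0, lam, mu, t_end, dt=1e-3, seed=None):
    """Approximate a birth-death process with a time-dependent loss rate.

    n0    : initial number of gene copies (assumed interpretation of n_t at t = 0)
    lam   : constant per-copy duplication rate (lambda in the postulates)
    mu    : function t -> per-copy loss rate at time t (mu_t in the postulates)
    t_end : total simulated time
    dt    : step size standing in for the infinitesimal interval
    Returns the list of gene counts at times 0, dt, 2*dt, ...
    """
    rng = random.Random(seed)
    n, t = n0, 0.0
    steps = int(round(t_end / dt))
    trajectory = [n]
    for _ in range(steps):
        p_dup = lam * dt
        p_loss = mu(t) * dt
        births = deaths = 0
        # One uniform draw per copy: at most one event per copy per step.
        for _ in range(n):
            u = rng.random()
            if u < p_dup:
                births += 1
            elif u < p_dup + p_loss:
                deaths += 1
        n = n + births - deaths
        t += dt
        trajectory.append(n)
    return trajectory


def decaying_loss_rate(t, base=0.2, extra=2.0, decay=3.0):
    """Hypothetical loss rate that starts high and decays toward a baseline."""
    return base + extra * math.exp(-decay * t)


if __name__ == "__main__":
    counts = simulate_gene_count(50, lam=0.5, mu=decaying_loss_rate,
                                 t_end=2.0, dt=0.01, seed=42)
    print("final gene count:", counts[-1])
```

The step size must be small enough that (λ + μ(t))·dt stays well below 1; otherwise the per-step probabilities no longer approximate the postulated rates. Note that this sketch covers only the time-dependent case stated in the highlight; the age-dependent model described in the abstract would additionally need to track the age of each duplicate.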


Summary

Introduction

Estimating the timing and mode of gene duplications along the evolutionary history of species can provide invaluable information about underlying mechanisms by which the genomes of organisms evolved and the genes with novel functions arose. A gene family is a group of genes with similar sequences that show evidence of descent from a common ancestor [1,2,3]. This includes orthologs that originate through speciation as well as duplicates (modeled here) that can be found within a species or shared between species from an older duplication event that predated speciation. Nonfunctionalization refers to the process in which mutations occur on one of the gene duplicates and produce a non-functional protein [11, 15]. A duplicate is most likely to be pseudogenized by degenerative mutation (nonfunctionalization) or lost due to population dynamics [9], the redundant copy may
