Abstract

Fractionation is the genome-wide process of losing one gene per duplicate pair following whole genome doubling (WGD). An important type of evidence for duplicate gene loss is the frequency distribution of similarities between paralogous gene pairs in a genome or orthologous gene pairs in two species. We extend a previous branching process model for fractionation, originally accounting for paralog similarities, to encompass the distribution of ortholog similarities, after multiple rounds of whole genome doubling and fractionation, with the speciation event occurring at any point. We estimate the fractionation rates during all the inter-event periods in each lineage of the plant family Malvaceae. We suggest a major correction of the phylogenetic position of the durian sub-family, and discover a new triplication event in this lineage.

Highlights

  • The evolutionary history of the flowering plants is punctuated with numerous whole genome doubling and tripling (WGD) events, a phenomenon that has only occasionally been identified in other phylogenetic domains.

  • Pairs of duplicate genes do tend to lose one redundant member over time, a process called fractionation, with some categories of genes more susceptible to loss than others [2], [3], and with losses sometimes concentrated in one subgenome rather than the other [4]. The question remains whether fractionation proceeds rapidly enough to counter the effect of recurrent WGD on gene number.

  • We have developed a model for predicting the shape of the distributions of similarities between duplicate gene pairs, based on the event times, the ploidy multiplicities of the events, rates of loss of duplicate genes from the genome, and rates of sequence divergence [5].


Summary

INTRODUCTION

The evolutionary history of the flowering plants (angiosperms) is punctuated with numerous whole genome doubling and tripling (WGD) events, a phenomenon that has only occasionally been identified in other phylogenetic domains. We have developed a model for predicting the shape of the distributions of similarities between duplicate gene pairs, based on the event times, the ploidy multiplicities of the events, rates of loss of duplicate genes from the genome (fractionation), and rates of sequence divergence [5]. Underlying these predictions is a paralog tree generated by a discrete-time branching process with one biologically motivated constraint; the process is mathematically tractable, and its parameters are well suited to statistical inference. The mathematics of the branching process involving recurring WGD within a single genome, and the inferential tasks associated with paralogous gene pairs, have been worked out, but attempts to extend this to orthologs in two genomes post-speciation [5], [6] have involved treatments not fully consistent with the spirit of the single genome model.
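
To make the idea of a discrete-time branching process with recurring WGD and fractionation concrete, the following is a minimal simulation sketch, not the authors' implementation. It rests on assumptions that are not stated in the text above: the biologically motivated constraint is taken to be that at least one copy of each gene survives every event, each copy is retained independently with a fixed probability, and the names `surviving_copies`, `simulate_gene_family`, `ploidies`, and `retain_probs` are hypothetical.

```python
import random


def surviving_copies(ploidy, retain_prob, rng):
    """Copies of one gene kept after a single event of the given ploidy.

    Assumption (not from the source): each of the `ploidy` copies is kept
    independently with probability `retain_prob`, conditioned on at least
    one copy surviving. The conditioning is done by rejection sampling.
    """
    if not 0.0 < retain_prob <= 1.0:
        raise ValueError("retain_prob must be in (0, 1]")
    while True:
        kept = sum(rng.random() < retain_prob for _ in range(ploidy))
        if kept >= 1:
            return kept


def simulate_gene_family(ploidies, retain_probs, seed=0):
    """Track the size of one gene family through successive events.

    ploidies[i] is the multiplicity of event i (2 = doubling, 3 = tripling);
    retain_probs[i] is the per-copy retention probability in the period
    that follows event i. Returns the family size before the first event
    and after each event.
    """
    rng = random.Random(seed)
    size = 1
    sizes = [size]
    for ploidy, retain_prob in zip(ploidies, retain_probs):
        # Every current copy is multiplied, then fractionated independently.
        size = sum(surviving_copies(ploidy, retain_prob, rng) for _ in range(size))
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    # A tripling followed by a doubling, with hypothetical retention rates.
    print(simulate_gene_family([3, 2], [0.4, 0.6], seed=1))
```

Under these assumptions a gene family can never go extinct, and its size after each event lies between its previous size and that size multiplied by the ploidy of the event. Repeating the simulation over many genes gives an empirical picture of how fractionation offsets the growth in gene number caused by recurrent WGD.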

THE GENERAL MODEL
THE MALVACEAE
INFERENCE ON THE DISTRIBUTION OF SIMILARITIES
RESULTS
DISCUSSION AND CONCLUSION