Abstract

We develop and implement a functional phylogenomic approach that explores the genomic origins of seed plant diversification. We first use 22,833 sets of orthologs from the nuclear genomes of 101 genera across land plants to reconstruct their phylogenetic relationships. One of the more salient results is the resolution of some enigmatic relationships in seed plant phylogeny, such as the placement of Gnetales as sister to the rest of the gymnosperms. Using this approach, we were also able to identify overrepresented functional gene ontology categories among genes that provide positive branch support for major nodes, prompting new hypotheses about genes associated with the diversification of angiosperms. For example, RNA interference (RNAi) has played a significant role in the divergence of monocots from other angiosperms, which has experimental support in Arabidopsis and rice. This analysis also implied that the second largest subunit of RNA polymerase IV and V (NRPD2) played a prominent role in the divergence of gymnosperms. This hypothesis is supported by the lack of 24nt siRNA in conifers, the maternal control of small RNA in the seeds of flowering plants, and the emergence of double fertilization in angiosperms. Our approach takes advantage of genomic data to define orthologs, reconstruct relationships, and narrow down candidate genes involved in plant evolution within a phylogenomic view of species' diversification.

Highlights

  • Attempts to clearly resolve the relationships among major seed plant groups using nuclear gene sequences have been hampered by the small number of completely sequenced genomes, the scarcity of ESTs for certain plant groups, and the lack of automated tools that can assemble and analyze large phylogenomic data sets

  • We found that genes involved in post-transcriptional silencing via RNA interference (RNAi)—increasingly important in understanding plant evolution—are significantly represented early in angiosperm and gymnosperm divergence, with an apparent loss of specific classes of small interfering RNAs in gymnosperms

  • To reduce the size of the dataset for maximum likelihood (ML) analysis, and to remove partitions with the most missing data, we constructed a matrix by only including genes with at least 30% representation across all genera
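
The representation filter in the last highlight can be illustrated with a minimal sketch. It assumes each gene partition is recorded as the set of genera in which an ortholog was found; the function name `filter_partitions`, the toy gene names, and the ten placeholder genera are hypothetical and are not taken from the study's pipeline. Only the 30% threshold comes from the text.

```python
def filter_partitions(presence, all_genera, min_fraction=0.30):
    """Return {gene: coverage} for partitions meeting the coverage threshold.

    presence     -- dict mapping gene name -> iterable of genera with an ortholog
    all_genera   -- iterable of every genus in the matrix
    min_fraction -- minimum fraction of genera required (0.30 per the text)
    """
    genera = set(all_genera)
    kept = {}
    for gene, found_in in presence.items():
        # Count only genera that actually belong to the matrix.
        coverage = len(set(found_in) & genera) / len(genera)
        if coverage >= min_fraction:
            kept[gene] = coverage
    return kept


# Toy data (hypothetical): ten placeholder genera and three gene partitions.
all_genera = [f"genus_{i}" for i in range(10)]
presence = {
    "gene_A": all_genera[:3],   # present in 3 of 10 genera
    "gene_B": all_genera[:2],   # present in 2 of 10 genera
    "gene_C": all_genera,       # present in all 10 genera
}

kept = filter_partitions(presence, all_genera)
print(sorted(kept))
```

With these toy inputs, `gene_B` falls below the threshold and is excluded, while `gene_A` and `gene_C` are retained for the reduced matrix.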

Introduction

Attempts to clearly resolve the relationships among major seed plant groups using nuclear gene sequences have been hampered by the small number of completely sequenced genomes, the scarcity of ESTs for certain plant groups, and the lack of automated tools that can assemble and analyze large phylogenomic data sets. Plant molecular phylogenetics has long relied on plastid genomes and only a few nuclear markers to infer relationships [1,3,4,5,6,7,8,9]. We first identify sets of orthologs from genomic sequences using a phylogenetic context [21]. We use these orthologs to construct a total-evidence phylogeny and examine the distribution of their support metrics per node [20]. The main premise of this approach is that genes (partitions) that are in agreement or in
