Abstract

Despite the many experimental and computational approaches that have been developed, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention, as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes, with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species, based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, which are needed to determine the presence or absence of particular gene functions in a given plant species, we included data from all 39 species available in the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profile comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families with greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species and the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity, suggesting that phylogenetic profiling methods could be applied as a filtering step in the identification of protein-protein interactions.
This feasibility study highlights the potential and the challenges of phylogenetic profiling methods for detecting functional relationships between genes. It also points to the need to enlarge the set of plant genes with proven secondary metabolism involvement, and to the limitations of distinct pathways as abstractions of relationships between genes.
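To make the notion of profile similarity concrete, the following minimal sketch compares presence/absence profiles of gene families across species. The Jaccard index is used here purely as an illustrative assumption; the abstract does not state which similarity measure the study used. The family names (`famA`, `famB`, `famC`) and the six-species toy profiles are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two equal-length 0/1 presence/absence profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of species")
    both = sum(1 for x, y in zip(a, b) if x and y)     # present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # present in at least one
    return both / either if either else 0.0


def pairwise_similarity(profiles):
    """Return the Jaccard similarity for every pair of gene families."""
    return {
        (g, h): jaccard(profiles[g], profiles[h])
        for g, h in combinations(sorted(profiles), 2)
    }


# Hypothetical toy profiles across six species (1 = present, 0 = absent).
profiles = {
    "famA": [1, 1, 0, 1, 0, 1],
    "famB": [1, 1, 0, 1, 0, 0],
    "famC": [0, 0, 1, 0, 1, 0],
}

similarities = pairwise_similarity(profiles)
for pair, score in similarities.items():
    print(pair, round(score, 2))
```

In this toy example, `famA` and `famB` co-occur in most species and would be candidates for a functional link, whereas `famC` shows a complementary pattern.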

Highlights

  • Developing an understanding of plant metabolism is a central aim of plant research

  • As the goal of this study was to exploit phylogenetic profiling for metabolism pathway assignments of genes with a focus on secondary metabolism, we first inspected the presence of known secondary metabolism pathways across the 24 plant species with available Ensembl and KEGG information (Figure 1)

  • Based on this presence/absence call, about one third of the secondary pathways (10 out of 31) were found to be present in all 24 plant species. For those pathways, no differential presence/absence profile was evident, which renders phylogenetic profiling unspecific because several different secondary metabolism pathways exhibit the same presence profile
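The point about uninformative profiles can be illustrated with a short sketch. It assumes binary pathway profiles and uses Shannon entropy as one possible (not source-confirmed) way to quantify how much a profile varies across species; the pathway names and four-species profiles are hypothetical.

```python
from collections import defaultdict
from math import log2


def profile_entropy(profile):
    """Shannon entropy (bits) of a 0/1 profile; 0.0 means no variation."""
    p = sum(profile) / len(profile)  # fraction of species with the pathway
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def group_by_profile(profiles):
    """Group pathway names that share an identical presence/absence profile."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[tuple(profile)].append(name)
    return dict(groups)


# Hypothetical pathway profiles across four species (1 = present).
pathway_profiles = {
    "pathway_1": [1, 1, 1, 1],
    "pathway_2": [1, 1, 1, 1],
    "pathway_3": [1, 0, 1, 0],
}

for name, profile in pathway_profiles.items():
    print(name, profile_entropy(profile))
print(group_by_profile(pathway_profiles))
```

Pathways present in every species have zero entropy and collapse into a single group, so their profiles cannot distinguish one pathway from another.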



Introduction

Developing an understanding of plant metabolism is a central aim of plant research. The better we can assess the metabolic capacities of plants and how they regulate their metabolic activities, the better we can make use of their manifold products and protect their fragile ecosystems. It should be possible to estimate a plant’s metabolic capacity based on the knowledge of all possible metabolic reactions, which are in turn encoded by the repertoire of enzyme genes in the respective genome. Complete and accurate genome annotation is therefore paramount for a comprehensive understanding of plant metabolism. However, reliable functional gene annotation is not trivial, nor is our current knowledge of possible metabolic pathways complete. We are not yet able to check for the presence of “textbook pathways” by virtue of accurate gene annotation. Secondary metabolite pathways in particular are still being discovered, requiring substantial experimental effort, as demonstrated by the discovery of a strigolactone pathway in plants (Alder et al, 2012).
