Background & Objective
The integration of genome‐wide association studies (GWAS) with metabolomics, termed mGWAS, offers a tremendous opportunity to gain insights into the genetic control of metabolism. A current bottleneck of mGWAS is the biological interpretation of the large amount of generated data, which include associations between SNP‐annotated genes and metabolites. This project aimed to develop a robust bioinformatic package that quantitatively annotates gene‐metabolite association pairs through metabolic pathway mapping.

Methods & Results
An R package, PathQuant, has been developed following Bioconductor guidelines to ensure reproducible results and ease of extension. The current version of PathQuant takes as input a list of gene‐metabolite association pairs and enables: (i) classification of genes into enzymatic vs. non‐enzymatic categories using the Enzyme Commission (EC) number as annotation; (ii) mapping of metabolic gene‐metabolite pairs onto a graph model of human KEGG metabolic pathway maps, in which genes are edges and metabolites are nodes; and (iii) calculation of the shortest reactional distance between the gene and the metabolite of each pair, with either graphical visualizations or text tables as outputs. As a proof of concept, PathQuant was used to map mGWAS gene‐metabolite association data from Shin et al. (2014) using all KEGG metabolism pathway maps, which include the metabolism reconstruction overview map and the specific individual pathway maps. We applied the method to 86 reported associations between 50 enzymatic genes and 66 metabolites measured in plasma. When mapped to the KEGG metabolism overview (Fig. 1), these associations are mostly found in the “Energy” (purple), “Amino acids” (orange) and “Nucleotides” (green) pathway classes.
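The graph model and distance calculation described above can be illustrated with a minimal sketch. This is not PathQuant's implementation (PathQuant is an R package); it is a Python illustration under stated assumptions: the pathway is treated as an undirected graph, the distance is taken as the number of reaction steps from the nearest endpoint of the gene's edge to the metabolite (0 when the metabolite is a substrate or product of that reaction), and a pair with no connecting path gets an infinite distance. The edge list `TOY_EDGES` and the function names are hypothetical.

```python
from collections import deque
import math

# Hypothetical toy pathway, not KEGG data: (metabolite, metabolite, gene).
# Metabolites are nodes; each gene labels the edge (reaction) joining them.
TOY_EDGES = [
    ("glucose", "G6P", "HK1"),
    ("G6P", "F6P", "GPI"),
    ("F6P", "F16BP", "PFKM"),
    ("arginine", "ornithine", "ARG1"),  # deliberately disconnected subgraph
]


def build_adjacency(edges):
    """Return an undirected metabolite adjacency map (an assumption of this sketch)."""
    adjacency = {}
    for a, b, _gene in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def gene_metabolite_distance(edges, gene, metabolite):
    """Breadth-first search from both endpoints of the gene's edge(s).

    Returns the number of reaction steps to the metabolite, or math.inf
    when the gene or metabolite is absent or no path connects them.
    """
    adjacency = build_adjacency(edges)
    sources = {m for a, b, g in edges if g == gene for m in (a, b)}
    if not sources or metabolite not in adjacency:
        return math.inf

    queue = deque((m, 0) for m in sources)
    seen = set(sources)
    while queue:
        node, dist = queue.popleft()
        if node == metabolite:
            return dist
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return math.inf
```

For example, `gene_metabolite_distance(TOY_EDGES, "HK1", "F16BP")` walks from G6P through F6P to F16BP, while a query such as `("HK1", "ornithine")` yields an infinite distance because the two lie in disconnected subgraphs, which mirrors the situation the abstract attributes to gaps in the pathway maps.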
PathQuant annotated finite numerical distances between 28 genes and 31 metabolites involved in 38 associations, of which 36 had a short distance, between 0 and 5, indicating that the reaction catalyzed by the gene‐encoded enzyme was no more than 5 reactions away from the one involving its associated metabolite. For 17 genes and 27 metabolites, representing 27 pairs, PathQuant could not calculate a finite numerical distance, which we attribute to current limitations of the human KEGG pathway maps, such as (i) missing enzymes annotated in humans, which creates disconnected subgraphs within the maps, (ii) the presence of the gene and the metabolite of a given pair on different maps, and (iii) limited coverage of lipid metabolic diversity in KEGG pathway maps (Fig. 1). While improving the tool's annotation capacity could address limitations (i) and (ii), 4 genes and 12 metabolites (21 pairs) were not present on any KEGG pathway map.

Conclusion
PathQuant provides a high‐throughput approach to link and objectively annotate gene‐metabolite pairs. Future work aims to upgrade PathQuant by refining the annotation, improving coverage of pathway classes by including pathway databases other than KEGG, and expanding the annotation to genes involved in cell signaling pathways.

Support or Funding Information
Genome Canada, Genome Québec, Genome British Columbia, Agilent Technologies, CIHR, Crohn's and Colitis Canada, Government of Canada.