Bioinformatic tools are required to carry out essential functions such as statistical analysis and database functionality. They are now also needed for one of the most difficult tasks: helping researchers decide which metabolites are the most biologically meaningful. This can be achieved by aiding the identification process, reducing feature redundancy, putting forward better candidates for tandem mass spectrometry (MS/MS), speeding up or automating the workflow, deconvolving the feature list through meta-analysis or multigroup analysis, or using stable isotopes and pathway mapping. This review thus focuses on the most recent and innovative bioinformatic advancements for identifying metabolites.

A primary objective of metabolomics beyond biomarker discovery is to identify the most meaningful metabolites that correlate with disease pathogenesis or other perturbations of metabolism. Metabolites play important roles in biological pathways; their flux or differential regulation (dysregulation) can reveal novel insights into disease and environmental influences. Therefore, one of the most important goals of metabolomic analysis has been to assign metabolite identities so that they can be used for further statistical and informed pathway analysis.1,2 Over the past few years, technologies for analyzing metabolites by untargeted or targeted metabolomics have undergone extensive improvements. Strides to establish the most efficient protocols for experimental design, sample extraction techniques, and data acquisition have paid off, providing robust, complex data sets.3−9 As more is being required of these data sets, such as assigning identity and biological meaning to the features, bioinformatics is the area of metabolomics currently undergoing the most needed growth. It is often the case that metabolomic analysis results in a list of metabolites with low specificity for the disease or stimulus being studied (Figure 1).
Some of these metabolites, such as acylcarnitines10−13 and fatty acids,14−17 seem to be dysregulated in a variety of diseases. They may be more indicative of a perturbed systemic cause (appetite, physical activity, diurnal rhythm changes, etc.), sample contamination, or instrumental/bioinformatic noise than of a specific biomarker of disease. An example of this can be seen in the analysis of urinary biomarkers of ionizing radiation, where dicarboxylic acids were downregulated in the rat after radiation exposure. It was shown that this observation was actually caused by a decreased appetite after radiation exposure perturbing the β-oxidation pathway, and not by radiation-induced cellular changes.18,19 Furthermore, dicarboxylic acids can leach out from plastics during the extraction process, further adding to the ambiguity of their role in ionizing radiation.20

Figure 1. Biomarkers that have high vs low disease specificity.

As well as identifying the correct source of the biomarkers, it is also important to identify their physiological role and how to utilize them as therapeutic targets. This has to start with the identification of the metabolite, which is shaped by filtering thresholds set by the user and is therefore intrinsically biased. These thresholds include those for fold change and p-value, which are highly dependent on the experiment; in vitro experiments would exhibit lower variation between biological replicates than in vivo experiments. The ease of identifying the metabolite is also determined by its concentration in the sample and its previous annotation in metabolite databases. Intensity thresholds that are set too high may remove biologically meaningful metabolites along with noise. Furthermore, a metabolite that is novel or not curated in a database may be overlooked, depending on the chemical knowledge of the researcher and what they deem meaningful.
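The dependence of the surviving feature list on user-set thresholds can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Feature` record, the `passes` and `filter_features` helpers, the default cutoffs, and the four example features are hypothetical and are not taken from any specific metabolomics package or data set.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    """One detected feature (hypothetical record layout)."""
    name: str
    fold_change: float  # mean(case) / mean(control); must be > 0
    p_value: float
    intensity: float    # mean peak intensity, arbitrary units


def passes(f, min_abs_log2_fc=1.0, max_p=0.05, min_intensity=1e4):
    """Return True if the feature clears all three user-set thresholds."""
    return (
        abs(math.log2(f.fold_change)) >= min_abs_log2_fc
        and f.p_value <= max_p
        and f.intensity >= min_intensity
    )


def filter_features(features, **thresholds):
    """Names of the features that survive the chosen thresholds."""
    return [f.name for f in features if passes(f, **thresholds)]


# Illustrative values only, not real measurements.
features = [
    Feature("feature_A", 4.0, 0.001, 5e5),  # large change, abundant
    Feature("feature_B", 0.4, 0.010, 8e3),  # large change, low abundance
    Feature("feature_C", 1.3, 0.030, 2e6),  # significant, small change
    Feature("feature_D", 2.5, 0.200, 3e5),  # large change, not significant
]

strict = filter_features(features)
relaxed = filter_features(features, min_abs_log2_fc=0.3, min_intensity=5e3)

print("strict: ", strict)
print("relaxed:", relaxed)
```

With the stricter defaults only `feature_A` survives, whereas relaxing the fold-change and intensity cutoffs also retains `feature_B` and `feature_C`. The low-abundance `feature_B` is the kind of feature that an intensity threshold set too high would discard.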
To transform the complex list of identified metabolites into markers of disease, or to assign the roles they play, bioinformatic tools can aid in identifying the pathways to which a metabolite may belong. The researcher can then use this knowledge of the metabolite's biology to probe the mechanism of the disease. Untargeted metabolomics has already been used in such a manner to find the source of neuropathic pain.21 N,N-Dimethylsphingosine was dysregulated in a rat model of neuropathic pain; furthermore, when dosed to control rats, it induced mechanical hypersensitivity. This metabolite implicated the sphingomyelin-ceramide pathway as a potential therapeutic target. Antimetabolite inhibitors of enzymes in this pathway were tested and were able to ameliorate neuropathic pain (unpublished data). This study holds promise for other metabolomic studies to maximize the potential information contained within the data for finding therapeutics for disease, rather than only providing lists of dysregulated metabolites.