Abstract

Computer-based technological innovation provides advancements in sophisticated and diverse analytical instruments, enabling massive amounts of data collection with relative ease. This is accompanied by a fast-growing demand for technological progress in data mining methods for analysis of big data derived from chemical and biological systems. From this perspective, use of a general “linear” multivariate analysis alone limits interpretations due to “non-linear” variations in metabolic data from living organisms. Here we describe a kernel principal component analysis (KPCA)-incorporated analytical approach for extracting useful information from metabolic profiling data. To overcome the limitation of important variable (metabolite) determinations, we incorporated a random forest conditional variable importance measure into our KPCA-based analytical approach to demonstrate the relative importance of metabolites. Using a market basket analysis, hippurate, the most important variable detected in the importance measure, was associated with high levels of some vitamins and minerals present in foods eaten the previous day, suggesting a relationship between increased hippurate and intake of a wide variety of vegetables and fruits. Therefore, the KPCA-incorporated analytical approach described herein enabled us to capture input–output responses, and should be useful not only for metabolic profiling but also for profiling in other areas of biological and environmental systems.

Highlights

  • Innovation in computer-based technology has driven not only advances in computing itself but also, through ripple effects, considerable technological progress in chemistry and biology

  • Urinary metabolic and elemental data, obtained by nuclear magnetic resonance (NMR) and inductively coupled plasma optical emission spectrometry (ICP-OES), respectively (Figure S1), were integrated into a data matrix prior to kernel principal component analysis (KPCA)

  • KPCA was performed using the analysis of variance (ANOVA) kernel function, varying the sigma parameter from 0.05 to 0.3 with the degree parameter d set to 1 (Figure S2), 2, and 3
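The KPCA step with an ANOVA kernel can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the ANOVA kernel form k(x, y) = (Σ_j exp(−σ(x_j − y_j)²))^d, which is the form used by the `anovadot` kernel in the R package kernlab (the source does not say which software was used). The data matrix is synthetic, and the intermediate sigma values are assumptions, since the source gives only the range 0.05 to 0.3.

```python
import numpy as np

def anova_kernel(X, Y, sigma=0.1, degree=1):
    """ANOVA kernel (assumed form): (sum_j exp(-sigma * (x_j - y_j)**2)) ** degree."""
    diff = X[:, None, :] - Y[None, :, :]            # pairwise differences, shape (n, m, p)
    return np.exp(-sigma * diff ** 2).sum(axis=2) ** degree

def kernel_pca(K, n_components=2):
    """Return training-sample scores on the leading kernel principal components."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    Kc = J @ K @ J                                  # center the kernel in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return scores, eigvals

# Synthetic stand-in for the integrated NMR + ICP-OES matrix (samples x variables).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)            # autoscale each variable

# Illustrative parameter grid; only the end points 0.05 and 0.3 come from the text.
results = {}
for sigma in (0.05, 0.1, 0.2, 0.3):
    for d in (1, 2, 3):
        results[(sigma, d)] = kernel_pca(anova_kernel(X, X, sigma, d))
```

Each entry of `results` holds the score matrix and the leading eigenvalues for one (sigma, d) pair, so score plots can be compared across parameter settings.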


Summary

Introduction

Innovation in computer-based technology has driven not only advances in computing itself but also, through ripple effects, considerable technological progress in chemistry and biology. NMR-based metabolic profiling has been applied to various biological and environmental samples[6,7,8,9,10,11,12,13]. Such research benefits from several databases and analytical support tools for preprocessing spectral data and assigning metabolites in complex chemical mixtures. Here, we describe a KPCA-incorporated analytical approach for extracting useful information from NMR-based metabolic profiling datasets. The variable importance obtained was validated using statistical tests and further analyzed using a market basket analysis (MBA)[33] to evaluate input–output responses (urinary metabolites and minerals associated with dietary food and nutritional information) in humans (Fig. 1).
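For readers unfamiliar with market basket analysis, the standard rule metrics (support, confidence, lift) can be sketched as below. This is an illustration under stated assumptions: the item names and the four "transactions" are invented for the example and are not data from the study, and the source does not specify which rule-mining algorithm or thresholds were used.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(a <= t for t in transactions)          # transactions containing the antecedent
    n_c = sum(c <= t for t in transactions)          # transactions containing the consequent
    n_ac = sum((a | c) <= t for t in transactions)   # transactions containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical per-subject item sets (discretized intake and urinary levels).
transactions = [
    {"vitC_high", "hippurate_high"},
    {"vitC_high", "hippurate_high", "potassium_high"},
    {"vitC_high"},
    {"potassium_high"},
]

support, confidence, lift = rule_metrics(
    transactions, {"vitC_high"}, {"hippurate_high"}
)
```

A lift above 1 indicates that the consequent occurs together with the antecedent more often than expected if the two were independent.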
