Statistical hypothesis testing of factor loading in principal component analysis and its application to metabolite set enrichment analysis

Hiroyuki Yamamoto, Gen Ishikawa, Yoshiaki Ohashi, Tamaki Fujimori, Hajime Sato, Kenjiro Kami

doi:10.1186/1471-2105-15-51

Abstract

Background: Principal component analysis (PCA) has been widely used to visualize high-dimensional metabolomic data in a two- or three-dimensional subspace. In metabolomics, some metabolites (e.g., the top 10 metabolites) have been subjectively selected when using factor loading in PCA, and biological inferences are made for these metabolites. However, this approach may lead to biased biological inferences because these metabolites are not objectively selected with statistical criteria.

Results: We propose a statistical procedure that selects metabolites with statistical hypothesis testing of the factor loading in PCA and makes biological inferences about these significant metabolites with a metabolite set enrichment analysis (MSEA). This procedure relies on the fact that the eigenvector in PCA for autoscaled data is proportional to the correlation coefficient between the PC score and each metabolite level. We applied this approach to two sets of metabolomic data from mouse liver samples: 136 of 282 metabolites in the first case study and 66 of 275 metabolites in the second case study were statistically significant. This result suggests that fixing the number of metabolites before the analysis is inappropriate, because the number of significant metabolites differs between studies when factor loading is used in PCA. Moreover, when an MSEA of these significant metabolites was performed, significant metabolic pathways were detected that were consistent with previous biological knowledge.

Conclusions: It is essential to select metabolites statistically to make unbiased biological inferences from metabolomic data when using factor loading in PCA. We propose a statistical procedure to select metabolites with statistical hypothesis testing of the factor loading in PCA, and to draw biological inferences about these significant metabolites with MSEA. We have developed an R package, “mseapca”, to facilitate this approach. The “mseapca” package is publicly available at the CRAN website.
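
To make the MSEA step concrete, the sketch below shows one common form of metabolite set enrichment analysis: an over-representation test based on the upper tail of the hypergeometric distribution (equivalent to a one-sided Fisher's exact test). This is an illustration in Python, not the code of the “mseapca” package; the function names, the toy metabolite IDs, and the pathway definitions are all hypothetical, and whether the package uses exactly this test is an assumption.

```python
from math import comb


def hypergeom_upper_tail(n_total, n_in_set, n_significant, n_overlap):
    """P(X >= n_overlap) when n_significant metabolites are drawn at random
    from n_total, of which n_in_set belong to the metabolite set."""
    upper = min(n_in_set, n_significant)
    favourable = sum(
        comb(n_in_set, k) * comb(n_total - n_in_set, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_total, n_significant)


def ora_p_values(significant, background, metabolite_sets):
    """Return {set name: p-value}. Only metabolites that were actually
    measured (the background) are counted."""
    background = set(background)
    significant = set(significant) & background
    p_values = {}
    for name, members in metabolite_sets.items():
        members = set(members) & background
        overlap = len(members & significant)
        p_values[name] = hypergeom_upper_tail(
            len(background), len(members), len(significant), overlap
        )
    return p_values


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    measured = [f"m{i}" for i in range(1, 11)]   # 10 measured metabolites
    significant = {"m1", "m2", "m3", "m4"}       # selected by the loading test
    pathways = {
        "pathway_A": ["m1", "m2", "m3"],         # fully covered by the selection
        "pathway_B": ["m8", "m9", "m10"],        # no overlap with the selection
    }
    for name, p in ora_p_values(significant, measured, pathways).items():
        print(name, p)
```

In the toy data, all three members of pathway_A are among the four significant metabolites, so its p-value is 7/210; pathway_B has no overlap and gets a p-value of 1. In practice the p-values across many pathways would also need a multiple-testing correction, which is left out here.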

Highlights

Principal component analysis (PCA) has been widely used to visualize high-dimensional metabolomic data in a two- or three-dimensional subspace
We describe the use of statistical hypothesis testing of factor loading in PCA using metabolome data from two studies
In metabolomics, the targeted metabolites from which biological inferences are drawn are selected subjectively when factor loading is used in PCA

Summary

Introduction

The statistical analysis of analytical data has been studied in chemometrics research [1]. Principal component analysis (PCA) [2] is routinely used to visualize high-dimensional metabolomic data in a two- or three-dimensional subspace. When a specific PC score is found to be related to a phenotype of interest [5,6], such as a time course or group information, the corresponding factor loading is evaluated to discern meaningful metabolites from which to draw biological inferences. In practice, some metabolites (e.g., the top 10 metabolites) have been selected subjectively on the basis of the factor loading, and biological inferences are made for these metabolites.
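
The link between the factor loading and a correlation coefficient can be sketched in a few lines. The notation below is an assumption introduced for illustration and does not appear in this excerpt: X is the N × p autoscaled data matrix, w a unit-length eigenvector of its correlation matrix S with eigenvalue λ, s the corresponding PC score, and r_j the correlation between s and the j-th metabolite x_j. The last line is the standard t-test for a correlation coefficient, which is presumably the kind of hypothesis test meant here.

```latex
% X: N x p autoscaled matrix (each column has mean 0 and variance 1)
\begin{align}
S &= \tfrac{1}{N-1}\, X^{\top} X, \qquad S\,w = \lambda\,w, \qquad w^{\top} w = 1, \\
s &= X w, \qquad \operatorname{var}(s) = w^{\top} S\, w = \lambda, \\
\operatorname{cov}(s, x_j) &= \tfrac{1}{N-1}\, x_j^{\top} X w = (S\,w)_j = \lambda\, w_j, \\
r_j &= \frac{\operatorname{cov}(s, x_j)}{\sqrt{\operatorname{var}(s)}\,\sqrt{\operatorname{var}(x_j)}}
     = \frac{\lambda\, w_j}{\sqrt{\lambda}\cdot 1} = \sqrt{\lambda}\, w_j, \\
T_j &= \frac{r_j \sqrt{N-2}}{\sqrt{1 - r_j^{2}}} \;\sim\; t_{N-2}
     \quad \text{under } H_0\colon \rho_j = 0 .
\end{align}
```

Because r_j is the eigenvector entry w_j scaled by the constant √λ, ranking metabolites by loading and ranking them by correlation with the PC score give the same order; the test statistic T_j is what turns that ranking into a significance cutoff instead of an arbitrary "top 10".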

Journal: BMC Bioinformatics
Publication Date: Feb 21, 2014
Citations: 128
License type: CC BY 2.0
