COBRA Preprint Series

Statistical hypothesis test of factor loading in principal component analysis and its application to metabolite set enrichment analysis

Hiroyuki Yamamoto, Human Metabolome Technologies, Inc.
Tamaki Fujimori, Human Metabolome Technologies, Inc.
Hajime Sato, Human Metabolome Technologies, Inc.
Gen Ishikawa, Human Metabolome Technologies, Inc.
Kenjiro Kami, Human Metabolome Technologies, Inc.
Yoshiaki Ohashi, Human Metabolome Technologies, Inc.

Abstract

Principal component analysis (PCA) is widely used to visualize high-dimensional metabolomic data in a two- or three-dimensional subspace. In metabolomics, a fixed number of metabolites (e.g. the top 10) is often selected subjectively on the basis of factor loading in PCA, and biological inferences are then made for these metabolites. However, this approach can lead to biased biological inferences because the metabolites are not selected objectively by a statistical criterion. We propose a statistical procedure that selects metabolites by a statistical hypothesis test of factor loading in PCA and makes biological inferences by metabolite set enrichment analysis (MSEA) on the significant metabolites. The procedure relies on the fact that, for autoscaled data, the eigenvector in PCA is proportional to the correlation coefficient between the PC score and the level of each metabolite. We applied this approach to two metabolomic data sets of mouse liver samples. In the first case study, 136 of 282 metabolites were statistically significant; in the second, 66 of 275 were. This result suggests that fixing the number of metabolites in advance is not appropriate, because the number of significant metabolites differs between studies when factor loading in PCA is used. Moreover, MSEA performed on the significant metabolites detected significant metabolic pathways, and these results are consistent with previous biological knowledge. When factor loading in PCA is used, it is essential to select metabolites statistically in order to make unbiased biological inferences from metabolome data. We developed an R package, "mseapca", to perform this approach. The "mseapca" package is publicly available on the CRAN website.
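
The relation the procedure relies on can be checked numerically. The following is a minimal sketch in plain Python, not the "mseapca" implementation: the toy data, the helper names (`autoscale`, `pearson`, `first_pc`) and the use of power iteration are all assumptions made for illustration. It assumes factor loading is defined as the eigenvector scaled by the square root of its eigenvalue, and that the hypothesis test is the usual t-test for a Pearson correlation with n - 2 degrees of freedom.

```python
import math
import random

def autoscale(X):
    """Center each column to mean 0 and scale it to unit sample variance."""
    n, p = len(X), len(X[0])
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def first_pc(Z, iters=1000):
    """Leading eigenpair of the correlation matrix Z'Z/(n-1) by power iteration."""
    n, p = len(Z), len(Z[0])
    R = [[sum(Z[i][j] * Z[i][k] for i in range(n)) / (n - 1)
          for k in range(p)] for j in range(p)]
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        v = [sum(R[j][k] * w[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    lam = sum(w[j] * sum(R[j][k] * w[k] for k in range(p)) for j in range(p))
    return lam, w

# Hypothetical toy data: 20 samples, 5 "metabolites"; the first three follow
# a shared latent factor, the last two are pure noise.
random.seed(0)
X = []
for _ in range(20):
    f = random.gauss(0, 1)
    X.append([f + 0.3 * random.gauss(0, 1),
              f + 0.3 * random.gauss(0, 1),
              -f + 0.3 * random.gauss(0, 1),
              random.gauss(0, 1),
              random.gauss(0, 1)])

Z = autoscale(X)
n, p = len(Z), len(Z[0])
lam, w = first_pc(Z)

# PC1 score of each sample, then factor loading (assumed: sqrt(eigenvalue) * eigenvector).
scores = [sum(row[j] * w[j] for j in range(p)) for row in Z]
loadings = [math.sqrt(lam) * wj for wj in w]

# Correlation between the PC1 score and each autoscaled metabolite level.
correlations = [pearson(scores, [row[j] for row in Z]) for j in range(p)]

# t statistic for H0: correlation = 0 (assumed test), with n - 2 degrees of freedom.
# A two-sided p-value would follow from the t distribution, e.g. scipy.stats.t.sf.
t_stats = [r * math.sqrt(n - 2) / math.sqrt(1 - r * r) for r in loadings]

for j in range(p):
    print(j, round(loadings[j], 4), round(correlations[j], 4), round(t_stats[j], 2))
```

In this sketch the loading and correlation columns agree to numerical precision, which is why a test on the correlation coefficient can be read as a test on the factor loading. Metabolites would then be selected by thresholding the resulting p-values rather than by taking a fixed top-k list.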

Disciplines

Biostatistics

Suggested Citation

Yamamoto, Hiroyuki; Fujimori, Tamaki; Sato, Hajime; Ishikawa, Gen; Kami, Kenjiro; and Ohashi, Yoshiaki, "Statistical hypothesis test of factor loading in principal component analysis and its application to metabolite set enrichment analysis" (October 2012). COBRA Preprint Series. Working Paper 99.
https://biostats.bepress.com/cobra/art99

Supplementary_Tables.xlsx (47 kB)

Included in

Biostatistics Commons

Collection of Biostatistics Research Archive
