Multitable Methods for Microbiome Data Integration.

Kris Sankaran, Susan P Holmes

doi:10.3389/fgene.2019.00627

Open Access

Abstract

The simultaneous study of multiple measurement types is a frequently encountered problem in practical data analysis. It is especially common in microbiome research, where several sources of data (for example, 16S rRNA, metagenomic, metabolomic, or transcriptomic data) can be collected on the same physical samples. There has been a proliferation of proposals for analyzing such multitable microbiome data, as is often the case when new data sources become more readily available, facilitating inquiry into new types of scientific questions. However, stepping back from the rush for new methods for multitable analysis in the microbiome literature, it is worthwhile to recognize the broader landscape of multitable methods, as they have been relevant in problem domains ranging across economics, robotics, genomics, chemometrics, and neuroscience. In different contexts, these techniques are called data integration, multi-omic, and multitask methods, for example. Of course, there is no unique optimal algorithm to use across domains: different instances of the multitable problem possess specific structure or variation that is worth incorporating in methodology. Our purpose here is not to develop new algorithms, but rather to (1) distill relevant themes across different analysis approaches and (2) provide concrete workflows for approaching analysis, as a function of ultimate analysis goals and data characteristics (heterogeneity, dimensionality, sparsity). Toward the second goal, we have made code for all analysis and figures available online at https://github.com/krisrs1128/multitable_review.

Highlights

We find that the scores are not nearly as closely aligned as they are for Canonical Correlation Analysis (CCA), but that they are more strongly associated with variation in android fat mass, as in the concatenated Principal Components Analysis (PCA) result of Figure 1.
We have studied the problem of multitable data analysis, reviewing both the algorithmic foundations and practical applications of various methods.
We have described approaches that are usually confined to particular literature areas and highlighted certain similarities in the process. For example, Principal Component Analysis with Instrumental Variables (PCA-IV) and the graph-fused lasso were proposed in very different contexts, but have similar goals.

Summary

Reviewed by: Kui Zhang, Michigan Technological University, United States; Jing Ma, Fred Hutchinson Cancer Research Center, United States.

In the legend, scores from species abundances are colored by family, while those for body composition associate samples with android fat mass. The association between these variables and the sample positions is not as strong as when performing PCA on the combined table. This is to be expected, as PCA maximizes variance without any regard to covariance, and the body composition table alone has a large portion of its variance related to android fat mass.

FIGURE 2 | The Canonical Correlation Analysis (CCA) analog of the PCA biplot, obtained by applying CCA to the combined body composition and microbial abundance data.

The reasoning behind the relative values of these two tuning parameters is that sparsity in species loadings is more important than sparsity across body composition variables, because the microbiome data are more high-dimensional. This seems to be the case even in the parallel-lasso context, where such structure has not been directly imposed.
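The contrast between PCA on a combined table and CCA can be illustrated with a minimal sketch. This is not the authors' code (their analysis is in the linked repository); it uses synthetic data and plain NumPy. The tables `X` and `Y`, the latent variables `shared` and `dominant`, and the helper names are all hypothetical. Here `dominant` plays the role of a high-variance feature confined to one table, and `shared` is the signal common to both.

```python
import numpy as np

def concatenated_pca_scores(X, Y, k=1):
    """Top-k PCA scores of the column-wise concatenation [X, Y]."""
    Z = np.hstack([X, Y])
    Z = Z - Z.mean(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def cca(X, Y, k=1, ridge=1e-8):
    """Classical CCA via whitening and an SVD of the cross-covariance.

    The small ridge term is only a numerical safeguard (an assumption of
    this sketch); it is not the sparse CCA penalty discussed in the text.
    """
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    # Scores for each table and the leading canonical correlations.
    return Xc @ (Wx @ U[:, :k]), Yc @ (Wy @ Vt.T[:, :k]), s[:k]

def abs_corr(a, b):
    """Absolute Pearson correlation between two vectors."""
    return abs(np.corrcoef(a, b)[0, 1])

# Synthetic example: both tables carry `shared`; only Y carries `dominant`.
rng = np.random.default_rng(0)
n = 200
shared = rng.normal(size=n)
dominant = 5.0 * rng.normal(size=n)  # large variance, present in Y only
X = np.outer(shared, np.ones(5)) + 0.5 * rng.normal(size=(n, 5))
Y = (np.outer(dominant, [1.0, 1.0, 0.0, 0.0])
     + np.outer(shared, [0.0, 0.0, 1.0, 1.0])
     + 0.5 * rng.normal(size=(n, 4)))

pca_scores = concatenated_pca_scores(X, Y)
x_scores, y_scores, rho = cca(X, Y)

print("PCA axis 1 vs dominant:", abs_corr(pca_scores[:, 0], dominant))
print("CCA X-scores vs shared:", abs_corr(x_scores[:, 0], shared))
print("first canonical correlation:", rho[0])
```

In this toy setting, the first axis of the concatenated PCA tracks the high-variance feature of a single table, while the first pair of CCA scores tracks the signal present in both. That is the behavior one would expect from a method that maximizes variance versus one that maximizes cross-table correlation.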

Journal: Frontiers in Genetics
Publication Date: Aug 28, 2019
Citations: 22
License type: CC BY 4.0
