Chapter 9 - Data analysis in metabolomics: from information to knowledge

Jacopo Troisi, Giovanni Troisi, Giovanni Scala, Sean M Richards

doi:10.1016/b978-0-323-85062-9.00009-x

https://doi.org/10.1016/b978-0-323-85062-9.00009-x

Abstract

Data analysis is a pivotal step in the metabolomics investigation pipeline: it extracts knowledge from datasets, yields new insights and perspectives, and generates new hypotheses about complex biological processes. It draws on several strategies, including exploratory analysis, which evaluates metabolites individually, and machine learning algorithms, which can be unsupervised (data driven) or supervised (task driven) and allow the interplay among the investigated metabolites to be studied. Among unsupervised methods, clustering algorithms and principal component analysis are particularly important in metabolomics. Among supervised methods, partial least squares discriminant analysis, artificial neural networks, and support vector machines play a prominent role. These algorithms can be used alone or combined in ensemble models. This chapter places particular emphasis on strategies for evaluating how effectively machine learning algorithms have been trained, including validation strategies, feature selection mechanisms, and hyperparameter optimization.
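
As a rough illustration of what a validation strategy can look like in practice, the sketch below runs k-fold cross-validation on a toy two-class dataset. It is not taken from the chapter: the data are synthetic, the helper names (`k_fold_indices`, `cross_validated_accuracy`, and so on) are hypothetical, and a simple nearest-centroid classifier stands in for the supervised algorithms named above purely to keep the example short and dependency-free.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Compute one mean vector (centroid) per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Average accuracy on the held-out fold across k train/test splits."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # The model is fitted on the training folds only.
        model = nearest_centroid_fit(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        correct = sum(
            nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return mean(scores)


# Synthetic "metabolite intensity" data (an assumption for illustration):
# two features per sample, two well-separated groups.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(30)]
y = ["control"] * 30 + ["case"] * 30

accuracy = cross_validated_accuracy(X, y, k=5)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

The point of the design is that every sample is scored exactly once by a model that never saw it during fitting, which is why the averaged score is a fairer estimate of generalization than accuracy on the training data.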
