Abstract

Introduction: Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.
Objectives: To merge into a single platform the steps required for metabolomics data processing.
Methods: KniMet is a workflow for the processing of mass spectrometry-based metabolomics data, built on the KNIME Analytics Platform.
Results: The approach includes the key steps of metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.
Conclusion: KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

Highlights

  • Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow

  • Among the analytical techniques employed in metabolomics, gas and liquid chromatography coupled with mass spectrometry (GC– and LC–MS) are the most commonly used, as they allow the identification of a large number of diverse molecular species

  • An appropriate evaluation of the reasons behind the presence of missing values in the data matrix, and their subsequent imputation, is fundamental to avoid biased statistical results (Di Guida et al 2016; Gromski et al 2014). In this application, missing value imputation can be performed with either the Random Forest (RF) or the K-Nearest Neighbour (KNN) algorithm, implemented as R scripts using the libraries missForest (Stekhoven and Bühlmann 2012) and impute (Hastie et al 2016) respectively, or with Small Value replacement (SV), i.e. half of the minimum value found for a given feature in a given sample
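
As an illustration of the Small Value replacement option, the following is a minimal Python sketch, not KniMet's own code (the text says the RF and KNN options are implemented as R scripts). It assumes that missing values are encoded as `None`, that rows are samples and columns are features, and that the minimum is taken over all observed values of a feature across samples. The function name `small_value_impute` and the toy matrix are hypothetical.

```python
from typing import List, Optional

# Rows are samples, columns are features; None marks a missing value (assumption).
Matrix = List[List[Optional[float]]]


def small_value_impute(matrix: Matrix) -> List[List[float]]:
    """Replace each missing value with half the minimum observed value of its feature."""
    n_features = len(matrix[0]) if matrix else 0
    fill_values = []
    for j in range(n_features):
        observed = [row[j] for row in matrix if row[j] is not None]
        if not observed:
            # A feature never detected cannot be imputed this way.
            raise ValueError(f"feature {j} has no observed values")
        fill_values.append(min(observed) / 2.0)
    return [
        [fill_values[j] if value is None else value for j, value in enumerate(row)]
        for row in matrix
    ]


# Toy data: three samples, three features.
data = [
    [10.0, None, 3.0],
    [8.0, 4.0, None],
    [None, 6.0, 5.0],
]
imputed = small_value_impute(data)
```

Here the missing entries are filled with 4.0, 2.0 and 1.5, half of each column's minimum. This kind of replacement is usually motivated by values that are missing because they fall below the detection limit; RF or KNN imputation may suit other missingness patterns better.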


Summary

Introduction

Among the analytical techniques employed in metabolomics, gas and liquid chromatography coupled with mass spectrometry (GC– and LC–MS) are the most commonly used, as they allow the identification of a large number of diverse molecular species. The large number of samples analyzed during high-throughput screenings, the number of processing steps, and the computational skills and resources required often create a bottleneck that slows these analyses. For these reasons, the KNIME Analytics Platform (Berthold et al 2007) was used to build a vendor-independent processing workflow. KniMet (Liggi 2017) joins several of the steps required to process GC– and LC–MS metabolomics data, producing a data matrix that is normalized, annotated and filtered of inconsistently detected features through a semi-automated, documented and reproducible analysis.
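
To make "filtered of inconsistently detected features" and "normalized" concrete, here is a minimal Python sketch of one common way to do each step. It is not taken from KniMet: the detection-rate threshold, the total-sum normalization, and the names `filter_features` and `total_sum_normalize` are illustrative assumptions, since this excerpt does not state which criteria the workflow applies.

```python
from typing import List, Optional

# Rows are samples, columns are features; None marks an undetected feature (assumption).
Matrix = List[List[Optional[float]]]


def filter_features(matrix: Matrix, min_detected: float = 0.5) -> List[int]:
    """Return indices of features detected in at least `min_detected` of the samples."""
    n_samples = len(matrix)
    keep = []
    for j in range(len(matrix[0])):
        detected = sum(1 for row in matrix if row[j] is not None)
        if detected / n_samples >= min_detected:
            keep.append(j)
    return keep


def total_sum_normalize(row: List[float]) -> List[float]:
    """Scale one sample so that its feature intensities sum to 1."""
    total = sum(row)
    if total == 0:
        raise ValueError("cannot normalize a sample with zero total intensity")
    return [value / total for value in row]


# Toy data: four samples, three features.
peaks = [
    [2.0, None, 6.0],
    [1.0, None, 3.0],
    [4.0, 5.0, None],
    [3.0, None, 1.0],
]
kept = filter_features(peaks, min_detected=0.75)  # feature 1 is seen in 1 of 4 samples
normalized = total_sum_normalize([peaks[0][j] for j in kept])
```

With a 75% threshold, features 0 and 2 are kept and the first sample normalizes to 0.25 and 0.75. In practice, any remaining missing values would need to be imputed before normalizing a whole matrix.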

KniMet features
Data deconvolution
Feature filtering
Metabolite annotation
Normalization
Conclusions
Compliance with ethical standards