Abstract

XCMS and MZmine 2 are two widely used software packages for preprocessing untargeted LC/MS metabolomics data. Both construct extracted ion chromatograms (EICs) and then detect peaks from the EICs, the first two steps in the data preprocessing workflow. While both packages have performed admirably in peak picking, they detect a problematic number of false positive EIC peaks and can also fail to detect real EIC peaks. These false positives and false negatives translate downstream into spurious and missing compounds, respectively, and are a significant limitation of most existing software packages that preprocess untargeted mass spectrometry metabolomics data. We seek to understand the specific reasons why XCMS and MZmine 2 find the false positive EIC peaks that they do and in what ways they fail to detect real compounds. We investigate differences between the EIC construction methods in XCMS and MZmine 2 and find several problems in the XCMS centWave peak detection algorithm, which we show are partly responsible for the false positive and false negative compound identifications. In addition, we find a problem with MZmine 2's use of centWave. We hope that a detailed understanding of the XCMS and MZmine 2 algorithms will allow users to work with them more effectively and will also help with future algorithmic development.
