Previous LC–MS metabolomic profiling of 127 F2 Barbarea vulgaris plants, derived from a cross between the parental glabrous (G) and pubescent (P) types, revealed four triterpenoid saponins (hederagenin cellobioside, oleanolic acid cellobioside, epihederagenin cellobioside, and gypsogenin cellobioside) that correlated with plant resistance to the insect herbivore Phyllotreta nemorum. In this study we demonstrate, for the first time, the efficiency of the multi-way decomposition method PARAllel FACtor analysis 2 (PARAFAC2) for exploring complex LC–MS data. PARAFAC2 enabled automated resolution and quantification of several elusive chromatographic peaks (e.g. overlapped peaks, peaks with shifted elution times, and peaks with a low s/n ratio) that could not be detected and quantified by conventional chromatographic data analysis. Raw LC–MS data of the 127 F2 B. vulgaris plants were arranged in a three-way array (elution time points × mass spectra × samples) and divided into 17 chromatographic intervals, and each interval was modeled individually by PARAFAC2. The PARAFAC2 models provided three main outputs for the resolved peaks in each interval: (1) elution time profiles, (2) relative abundances, and (3) pure mass spectra. PARAFAC2 scores corresponding to the relative abundances of the resolved peaks were extracted and used for correlation and partial least squares (PLS) analysis. A total of 71 PARAFAC2 components (corresponding to actual peaks, baselines and tails of neighboring peaks) were modeled from the 17 chromatographic retention time intervals of the LC–MS data. In addition to the four previously known saponins, correlation and PLS analysis identified five unknown saponin-like compounds that were significantly correlated with insect resistance. The method also enabled a good separation between resistant and susceptible F2 plants.
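The data arrangement and the score-based correlation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array dimensions (100 time points, 20 m/z channels), the evenly spaced interval boundaries, the random data, and the function names `split_into_intervals` and `score_resistance_correlation` are all assumptions made for illustration, and the PARAFAC2 fit itself is not shown.

```python
import numpy as np


def split_into_intervals(data, boundaries):
    """Split a (time x m/z x samples) array along the elution-time axis.

    `boundaries` is a sorted sequence of time-point indices marking the
    interval edges, e.g. [0, 40, 100] gives intervals [0, 40) and [40, 100).
    """
    return [data[start:stop] for start, stop in zip(boundaries[:-1], boundaries[1:])]


def score_resistance_correlation(scores, resistance):
    """Pearson correlation of each component's per-sample score with resistance.

    `scores` has shape (samples, components); `resistance` has shape (samples,).
    Components with constant scores would give a division by zero.
    """
    scores = np.asarray(scores, dtype=float)
    resistance = np.asarray(resistance, dtype=float)
    sc = scores - scores.mean(axis=0)
    rc = resistance - resistance.mean()
    denom = np.sqrt((sc ** 2).sum(axis=0) * (rc ** 2).sum())
    return (sc * rc[:, None]).sum(axis=0) / denom


# Hypothetical toy data: 100 time points, 20 m/z channels, 127 samples.
rng = np.random.default_rng(0)
data = rng.random((100, 20, 127))
resistance = rng.random(127)

# Assumed evenly spaced edges; real intervals would follow the chromatogram.
boundaries = np.linspace(0, data.shape[0], 18).astype(int)
intervals = split_into_intervals(data, boundaries)
```

Each element of `intervals` keeps the full m/z and sample dimensions, so it could be passed on its own to a PARAFAC2 routine; the per-sample scores that such a fit returns are the kind of input `score_resistance_correlation` expects.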
PARAFAC2 spectral loadings, which correspond to the pure mass spectra of chromatographic peaks, matched well with experimentally recorded mass spectra (correlation-based similarity >95%). This made it possible to extract pure mass spectra of highly overlapped and low s/n ratio peaks.
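As an illustration of a correlation-based spectral similarity, the sketch below compares a model spectrum with a reference spectrum. The abstract does not state the exact measure used, so the Pearson correlation over a shared m/z grid is an assumption, as are the function name `spectral_similarity` and the toy intensity values.

```python
import numpy as np


def spectral_similarity(model_spectrum, reference_spectrum):
    """Pearson correlation between two spectra sampled on the same m/z grid."""
    a = np.asarray(model_spectrum, dtype=float)
    b = np.asarray(reference_spectrum, dtype=float)
    if a.shape != b.shape:
        raise ValueError("spectra must share the same m/z grid")
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical intensities; the loading is a rescaled copy of the reference,
# so the two spectra have the same shape and differ only in scale.
reference = np.array([0.0, 5.0, 100.0, 12.0, 0.0, 40.0])
loading = 0.01 * reference
similarity = spectral_similarity(loading, reference)
```

Because the Pearson correlation is invariant to scaling, a loading on an arbitrary intensity scale can be compared directly with a recorded spectrum.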