An LC-MS-based lipidomics pre-processing framework underpins rapid hypothesis generation towards CHO systems biotechnology.

Hock Chuan Yeo, Dong-Yup Lee, Shuwen Chen, Ying Swan Ho

doi:10.1007/s11306-018-1394-0

Abstract

Given a raw LC-MS dataset, researchers often need to generate initial hypotheses rapidly, in conjunction with other 'omics' datasets, without time-consuming lipid verification. Furthermore, for meta-analyses of many datasets, exhaustive confirmatory analyses may be impractical. In other cases, samples for validation may be difficult to obtain, replicate or maintain. Thus, it is critical that the computational identification of lipids has appropriate accuracy and coverage, and is unbiased by a researcher's experience and prior knowledge. We aim to prescribe a systematic framework for lipid identification that does not rely on characteristic retention times and instead fully exploits the underlying mass features. Initially, a hybrid technique for deducing both common and distinctive daughter ions is used to infer parent lipids from deconvoluted spectra. This is followed by parent confirmation using basic knowledge of their preferred product ions. Using the framework, we achieved an accuracy of ~80% by correctly identifying 101 species from 18 classes in Chinese hamster ovary (CHO) cells. The resulting inferences could explain the recombinant-producing capability of CHO-SH87 cells, compared to non-producing CHO-K1 cells. For comparison, an XCMS-based study of the same dataset, guided by a user's ad hoc knowledge, identified fewer than 60 species from 12 classes out of thousands of possibilities. We describe a systematic LC-MS-based framework that identifies lipids for rapid hypothesis generation.

Full Text