Liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) is commonly used to identify compounds in complex samples because it provides high chromatographic and mass spectral resolution. Subsequent data processing workflows must preserve this resolution to fully exploit the data. "Region of interest" (ROI) algorithms were introduced as an alternative to equidistant binning for compressing HRMS data because they better preserve the mass spectral resolution. In this paper, we present a new ROI algorithm that improves the selection of contiguous m/z traces, among other things by introducing the concept of a chromatographic filter, allows automated optimisation of the admissible mass-to-charge deviation (δm/z), and can be used to match ROIs across multiple samples. The algorithm was tested on an LC-HRMS dataset comprising 21 replicate injections of a wastewater effluent extract and assessed on its ability to correctly retrieve the ROIs corresponding to 57 compounds and match them across all injections. In summary, it achieved a ten-fold compression rate in on-disk storage at a noise threshold of 200 counts, and the median ROI length matched the observed chromatographic peak width (12-23 points). Correct ROI matching with a mass accuracy of 9 ppm was observed for 52 compounds across all 21 injections, with only one compound split between two adjacent m/z traces in six runs. Overall, the new algorithm performed favourably compared to the ROI algorithm currently used in the well-established ROI-MCR (multivariate curve resolution) workflow for deconvolution of HRMS chromatographic data.
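To illustrate the general idea of ROI selection as opposed to equidistant binning, the following is a minimal Python sketch of a generic ROI builder. It is not the algorithm presented in this paper: the chromatographic filter, the automated δm/z optimisation and the cross-sample matching are not reproduced. The names `ROI`, `find_rois`, `tol_ppm` and `min_length` are hypothetical, and the default values (200 counts, 9 ppm) are borrowed from the figures quoted above purely for illustration. The sketch assumes each scan is a list of `(m/z, intensity)` pairs.

```python
from dataclasses import dataclass, field


@dataclass
class ROI:
    """One contiguous m/z trace (hypothetical container for this sketch)."""
    mzs: list = field(default_factory=list)
    intensities: list = field(default_factory=list)
    scans: list = field(default_factory=list)

    @property
    def mean_mz(self):
        return sum(self.mzs) / len(self.mzs)


def find_rois(scans, noise=200.0, tol_ppm=9.0, min_length=3):
    """Group centroids from consecutive scans into ROIs.

    A centroid above `noise` extends the closest open ROI whose mean m/z
    lies within `tol_ppm`; otherwise it starts a new ROI. An ROI that is
    not extended in a scan is closed and kept only if it spans at least
    `min_length` scans.
    """
    finished, open_rois = [], []
    for scan_idx, scan in enumerate(scans):
        extended = set()   # indices of open ROIs already extended in this scan
        new_rois = []
        for mz, intensity in scan:
            if intensity < noise:
                continue   # discard points below the noise threshold
            best, best_dev = None, None
            for i, roi in enumerate(open_rois):
                if i in extended:
                    continue
                dev = abs(mz - roi.mean_mz) / roi.mean_mz * 1e6  # deviation in ppm
                if dev <= tol_ppm and (best is None or dev < best_dev):
                    best, best_dev = i, dev
            if best is None:
                roi = ROI()
                new_rois.append(roi)
            else:
                roi = open_rois[best]
                extended.add(best)
            roi.mzs.append(mz)
            roi.intensities.append(intensity)
            roi.scans.append(scan_idx)
        # Close every open ROI that was not extended in this scan.
        still_open = []
        for i, roi in enumerate(open_rois):
            if i in extended:
                still_open.append(roi)
            elif len(roi.scans) >= min_length:
                finished.append(roi)
        open_rois = still_open + new_rois
    finished.extend(r for r in open_rois if len(r.scans) >= min_length)
    return finished
```

Because each ROI keeps the measured m/z values rather than snapping them to fixed bin edges, the mass spectral resolution of the raw data is retained while points below the noise threshold and short, isolated traces are dropped, which is where the compression comes from.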