Abstract

Non-targeted analysis provides a comprehensive approach to analyze environmental and biological samples for nearly all chemicals present. One of the main shortcomings of current analytical methods and workflowsis that they are unable to provide anyquantitative information constituting an important obstacle in understanding environmental fate and human exposure. Herein, we present an in silico quantification method using mahine-learningfor chemicals analyzed using electrospray ionization (ESI). We considered three data sets from different instrumental setups: (i) capillary electrophoresis electrospray ionization-mass spectrometry (CE-MS) in positive ionization mode (ESI+), (ii) liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QTOF/MS) in ESI+ and (iii) LC-QTOF/MS innegative ionization mode (ESI-). We developed and applied two different machine-learning algorithms: a random forest (RF) and an artificial neural network (ANN) to predict the relative response factors (RRFs) of different chemicals based on their physicochemical properties. Chemical concentrations can then be calculated by dividing the measured abundance of a chemical, as peak area or peak height, by its corresponding RRF. We evaluated our models and tested their predictive power using 5-fold cross-validation (CV) andy randomization. Both the RF and the ANN models showed great promise in predicting RRFs. However, the accuracy of the predictions was dependent on the data set composition and the experimental setup. For the CE-MS ESI+ data set, the best model predicted measured RRFs with a mean absolute error (MAE) of 0.19 log units and a cross-validation coefficient of determination (Q2) of 0.84 for the testing set. For the LC-QTOF/MS ESI+ data set, the best model predicted measured RRFs with an MAE of 0.32 and a Q2 of 0.40. For the LC-QTOF/MS ESI- data set, the best model predicted measured RRFs with a MAE of 0.50 and a Q2 of 0.20. Our findings suggest that machine-learning algorithms can be used for predicting concentrations of nontargeted chemicals with reasonable uncertainties, especially in ESI+, while the application on ESI- remains a more challenging problem.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call