Abstract

Liquid chromatography-high resolution mass spectrometry (LC-HRMS) and gas chromatography-high resolution mass spectrometry (GC-HRMS) have revolutionized analytical chemistry, among many other disciplines. In theory, these instruments can capture the whole chemical universe contained in a sample, which gives the scientific community unprecedented opportunities. Laboratories equipped with these instruments produce large volumes of data daily that can be digitally archived. Digital storage opens up the opportunity for retrospective suspect screening, that is, investigating the occurrence of chemicals in the stored chromatograms. The first step of this approach is to predict which data are most appropriate to search. In this study, we built an optimized multi-label classifier for predicting the most appropriate instrumental method (LC-HRMS, GC-HRMS, or both) for the analysis of chemicals in digital specimens. The approach involved generating a baseline model based on the knowledge an expert would use, and generating an optimized machine learning model. A multi-step feature selection approach, a model selection strategy, and optimization of the classifier’s hyperparameters led to a model whose accuracy outperformed the baseline implementation. The models were used to predict the most appropriate instrumental technique for new substances. The scripts are available on GitHub and the dataset on Zenodo.

Highlights

  • Analytical environmental chemistry focuses on the occurrence of chemicals in environmental samples [1] and the development of new analytical methods for their determination [2,3]

  • The objectives of our study were (i) to create a training and test set based on knowledge gained by the experts of NORMAN, (ii) to build a model with the highest accuracy and lowest possible complexity, and (iii) to apply the model to chemicals of the NORMAN Substance database to predict the type of data (LC-HRMS, GC-HRMS, or both) to be investigated

  • A dataset was mined from the website of the NORMAN Suspect List Exchange [31] with the objective of modeling the appropriate instrumental method (GC-HRMS, LC-HRMS, or both)

Introduction

Analytical environmental chemistry focuses on the occurrence of chemicals (known as emerging contaminants) in environmental samples [1] and the development of new analytical methods for their determination [2,3]. Traditional analytical methods focus on a list of preselected contaminants. This trend changed during the last decade after the introduction of high-resolution mass spectrometry (HRMS) detectors [4]. The term “LC” includes many techniques (reversed-phase, HILIC, ion-exchange chromatography). Reversed-phase liquid chromatography is the most frequently used LC separation technique for the analysis of semi-polar and polar contaminants of emerging concern. The analysis of a sample by reversed-phase LC-HRMS (from here on referred to as LC-HRMS) and by GC-HRMS theoretically allows the detection of a very wide range of the chemicals contained in a given sample, within analytical limitations (e.g., detection limits, sensitivity, matrix interferences).

