Abstract

Prediction of gas chromatographic retention indices from compound structure is an important task in analytical chemistry. Predicted retention indices can be used as a reference in a mass spectrometry library search, even though they are less accurate than experimental reference values. In the last few years, deep learning has been applied to this task and has drastically improved the accuracy of retention index prediction for non-polar stationary phases. In this work, we demonstrate for the first time the use of deep learning for retention index prediction on polar (e.g., polyethylene glycol, DB-WAX) and mid-polar (e.g., DB-624, DB-210, DB-1701, OV-17) stationary phases. The mean absolute error achieved lies in the range of 16–50 for several stationary phases and test data sets. We also demonstrate that our approach can be applied directly to the prediction of second-dimension retention times (GC × GC) if a large enough data set is available. The achieved accuracy is considerably better than previous results obtained using linear quantitative structure-retention relationships and ACD ChromGenius software. The source code and pre-trained models are available online.

Highlights

  • Some compounds were removed from these data sets according to the criteria given in the Methods section (Section 3.3). n-Alkanes were removed from these data sets

Summary

Introduction

The gas chromatographic retention index (RI) is a value that does not strongly depend on particular chromatographic conditions and characterizes the ability of a given stationary phase (SP) to retain a given molecule [1]. RI can be used [7,8] as an additional constraint for the library search in gas chromatography-mass spectrometry (GC-MS). All available reference databases together contain RI values for fewer than 150,000 compounds, whereas mass spectra are available for two to three times as many compounds. Methods of mass spectra prediction and analysis for GC-MS identification without reference databases are also available [9,10]. Predicted RI can be used as a reference for the GC-MS library search [7,11,12,13], in particular in metabolomic applications.
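The idea of using RI as an additional constraint in a library search can be illustrated with a minimal sketch. It is not the procedure used in this work or in the cited references: the `Candidate` class, the `filter_by_ri` function, the score scale, the compound names, and the tolerance value are all hypothetical and chosen only for illustration. In practice, the tolerance would have to be set with the expected error of the predicted RI in mind.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A library search hit (hypothetical structure, for illustration only)."""
    name: str
    match_score: float   # spectral similarity; higher is better (assumed scale)
    predicted_ri: float  # RI predicted for this candidate structure


def filter_by_ri(candidates, observed_ri, tolerance):
    """Keep candidates whose predicted RI is within +/- tolerance of the
    observed RI, then rank the survivors by spectral match score."""
    kept = [c for c in candidates
            if abs(c.predicted_ri - observed_ri) <= tolerance]
    return sorted(kept, key=lambda c: c.match_score, reverse=True)


# Made-up example data: three hits with similar spectral scores.
candidates = [
    Candidate("compound A", 910.0, 1480.0),
    Candidate("compound B", 905.0, 1215.0),
    Candidate("compound C", 870.0, 1190.0),
]

# The tolerance of 100 is an arbitrary illustrative value.
shortlist = filter_by_ri(candidates, observed_ri=1200.0, tolerance=100.0)
print([c.name for c in shortlist])
```

Here the hit with the best spectral score ("compound A") is discarded because its predicted RI differs from the observed value by far more than the tolerance, which is how an RI constraint can remove false candidates that mass spectra alone do not distinguish.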
