Various aspects of retention index usage for GC-MS library search: A statistical investigation using a diverse data set

Dmitriy D Matyushin, Anastasia Yu Sholokhova, Anastasia E Karnaeva, Aleksey K Buryak

doi:10.1016/j.chemolab.2020.104042

Abstract

This work presents a large-scale statistical evaluation of various aspects of using the retention index in GC-MS library search, based on a diverse data set. A search in a large library often fails to return the correct compound even when the library contains it. One way to improve a spectral library search is to use retention index information. The aim of this study is to explore statistical features that can help in developing automated software that searches a large database for diverse, completely unknown compounds. The data set used as a source of queries contains about 11 thousand spectra of compounds belonging to diverse chemical classes. Six equations for matching reference and experimental “retention index – spectrum” pairs were compared. Good results were obtained with a linear equation for pair similarity: the similarity of two pairs is the sum of the spectral similarity and the product of a negative adjustable weight parameter and the absolute difference between the reference and query retention indices. This equation performs as well as or better than much more complex equations that contain two adjustable parameters instead of one. The widely used threshold-based approach, in which candidates with a large retention index deviation are rejected, performs worse than the other equations. The use of retention indices predicted with neural networks as reference values was also considered. Modern universal retention prediction models, which are applicable to a wide variety of compounds, are still quite inaccurate compared with values from databases, but these predicted values also improve a library search. When predicted retention indices are used as reference, the linear equation for matching “retention index – spectrum” pairs also performs as well as or better than the other equations.
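The linear equation described above can be sketched in a few lines of Python and set against the threshold-based approach. This is only an illustration under stated assumptions, not the authors' implementation: the `Candidate` class, the function names, the 0 to 1 similarity scale, the weight of -0.001 and the 50-unit threshold are all hypothetical, as are the three candidate entries.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                   # hypothetical library entry label
    spectral_similarity: float  # assumed scale: 0 (no match) to 1 (identical)
    reference_ri: float         # library or predicted retention index


def linear_pair_score(candidate, query_ri, weight):
    """Spectral similarity plus weight * |RI difference|, with weight <= 0."""
    if weight > 0:
        raise ValueError("weight must be non-positive")
    return candidate.spectral_similarity + weight * abs(candidate.reference_ri - query_ri)


def rank_linear(candidates, query_ri, weight):
    """Rank all candidates by the linear pair score, best first."""
    return sorted(candidates,
                  key=lambda c: linear_pair_score(c, query_ri, weight),
                  reverse=True)


def rank_threshold(candidates, query_ri, max_deviation):
    """Reject candidates beyond max_deviation, then rank by spectrum only."""
    kept = [c for c in candidates if abs(c.reference_ri - query_ri) <= max_deviation]
    return sorted(kept, key=lambda c: c.spectral_similarity, reverse=True)


# Made-up candidates for a query with retention index 1100.
candidates = [
    Candidate("A", 0.92, 1260.0),  # best spectrum, retention index far off
    Candidate("B", 0.90, 1105.0),
    Candidate("C", 0.80, 1100.0),
]
query_ri = 1100.0

linear_ranking = [c.name for c in rank_linear(candidates, query_ri, weight=-0.001)]
threshold_ranking = [c.name for c in rank_threshold(candidates, query_ri, max_deviation=50.0)]
print(linear_ranking)     # ['B', 'C', 'A']
print(threshold_ranking)  # ['B', 'C']
```

In this sketch the linear score only demotes candidate A, while the threshold removes it from the list entirely; the single weight parameter controls how strongly a retention index mismatch is penalized.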
The distribution of differences between query and reference indices (both calculated and experimental) was found to be close to an exponential distribution near zero. The dependence of the fraction of correct identifications on the accuracy of the reference retention indices was studied. Random noise with a double exponential distribution was added to exact values to create “reference” retention indices with a predefined accuracy. The use of the molecular mass and molecular formula as additional constraints during a library search was also considered.
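The noise-addition step can be sketched as follows, assuming that "double exponential" refers to the Laplace distribution and that the predefined accuracy is expressed as a mean absolute error (for a Laplace distribution with scale b, the mean absolute deviation equals b). The function names, the seed and the retention index values are hypothetical; the abstract does not specify how the authors parameterized the noise.

```python
import random


def laplace_noise(rng, scale):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def make_noisy_references(exact_ris, mean_abs_error, seed=0):
    """Return 'reference' indices: exact values plus Laplace noise whose
    expected absolute deviation equals mean_abs_error."""
    rng = random.Random(seed)
    return [ri + laplace_noise(rng, mean_abs_error) for ri in exact_ris]


# Made-up exact retention indices, for illustration only.
exact_ris = [1000.0 + 0.1 * i for i in range(20000)]
target_error = 30.0

noisy_ris = make_noisy_references(exact_ris, target_error)
observed_error = sum(abs(n - e) for n, e in zip(noisy_ris, exact_ris)) / len(exact_ris)
print(round(observed_error, 1))  # should be close to target_error
```

Repeating a library search with `noisy_ris` generated for several values of `target_error` would give one point per accuracy level on a curve of correct identifications versus reference accuracy.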

Journal: Chemometrics and Intelligent Laboratory Systems
Publication Date: May 11, 2020
Citations: 20

Similar Papers

Gas Chromatographic Retention Index Prediction Using Multimodal Machine Learning
Dmitriy D Matyushin ... Aleksey K Buryak
IEEE Access | VOL. 8
01 Jan 2020

Retention index thresholds for compound matching in GC–MS metabolite profiling
Nadine Strehmel ... Joachim Kopka
Journal of Chromatography B | VOL. 871
08 May 2008

Use of boiling point–Lee retention index correlation for rapid review of gas chromatography-mass spectrometry data
William P Eckel ... Tobias Kind
Analytica Chimica Acta | VOL. 494
23 Sep 2003

Prediction of retention indices. VI: Isothermal and temperature-programmed retention indices, methylene value, functionality constant, electronic and steric effects
C.T Peng
Journal of Chromatography A | VOL. 1217
06 Feb 2010
