Retention Index Units Research Articles

In gas chromatography–mass spectrometry-based untargeted metabolomics, metabolites are identified by comparing their mass spectra and chromatographic retention times with reference databases or standard materials. Accordingly, machine learning has been used to predict the retention time of metabolites that lack reference data. However, retention time prediction for trimethylsilyl derivatives of metabolites, which are typically analyzed in untargeted metabolomics using gas chromatography, has been poorly explored. Here, we provide a rationalized framework for machine learning-based retention time prediction of trimethylsilyl derivatives of metabolites in gas chromatography. We compared different machine learning paradigms and explored how the computational representation of molecular structure used to train the prediction models (fingerprint class and fingerprint calculation software) influences the results. We tested predicted retention times with chemical ionization and electron impact ionization sources in simulated and real cases. Machine learning ranked the correct identity well, although its power to filter out false identities was limited when a spectrum or a monoisotopic mass matched multiple candidates. Specifically, machine learning prediction yielded median absolute and relative retention index (relative retention time) errors of 37.1 retention index units and 2%, respectively. In addition, fingerprint class and fingerprint calculation software, as well as the molecular structural similarity between the training set and the test or real case sets, proved to be critical modulators of prediction performance. Finally, we leveraged the structural similarity between the training set and the test or real case set to determine the probability that the prediction error is below a specific threshold. Overall, our study demonstrates that predicted retention time can provide insights into the true structure of unknown metabolites by ranking candidate molecular identities from most to least plausible, and it sets guidelines for assessing the confidence of metabolite identification based on predicted retention time data.
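
The two quantities this abstract reports, the median absolute and relative retention index errors and the ranking of candidate identities, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `prediction_errors` and `rank_candidates`, and all retention index values in the usage lines, are hypothetical and chosen only for illustration.

```python
from statistics import median


def prediction_errors(observed, predicted):
    """Return (median absolute error, median relative error in %)
    for paired observed and predicted retention indices."""
    abs_errors = [abs(p - o) for o, p in zip(observed, predicted)]
    rel_errors = [100.0 * e / o for e, o in zip(abs_errors, observed)]
    return median(abs_errors), median(rel_errors)


def rank_candidates(observed_ri, predicted_ri_by_candidate):
    """Order candidate names from most to least plausible, i.e. by
    increasing distance between predicted and observed retention index."""
    return sorted(
        predicted_ri_by_candidate,
        key=lambda name: abs(predicted_ri_by_candidate[name] - observed_ri),
    )


# Made-up values, for illustration only.
med_abs, med_rel = prediction_errors([1000, 2000, 1500], [1010, 1960, 1530])
ranking = rank_candidates(1500, {"A": 1700, "B": 1510, "C": 1450})
```

Ranking by absolute distance is the simplest possible scoring rule and is assumed here; the abstract does not state how the study scored candidates.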

The goal of many metabolomic studies is to identify the molecular structure of endogenous molecules that are differentially expressed among sample or treatment groups. The identified compounds can then be used to gain an understanding of disease mechanisms. Unfortunately, despite recent advances in a variety of analytical techniques, small molecule (<1000 Da) identification remains difficult. Rarely can a chemical structure be determined from experimental "features" such as retention time, exact mass, and collision-induced dissociation spectra. Without a known structure, biological significance remains obscure. In this study, we explore an identification method in which the measured exact mass of an unknown is used to query available chemical databases to compile a list of candidate compounds. Predictions are made for the candidates using models of experimental features that have been measured for the unknown. The predicted values are used to filter the candidate list by eliminating compounds whose predicted values differ substantially from those measured for the unknown. The intent is to reduce the list of candidates to a reasonable number that can be obtained and measured for confirmation. To facilitate this exploration, we measured data and created models for two experimental features: MS Ecom₅₀ (the energy in electronvolts required to fragment 50% of a selected precursor ion) and HPLC retention index. Using a data set of 52 compounds, Ecom₅₀ models were developed based on both Molconn and CODESSA structural descriptors. These models gave r² values of 0.89 to 0.94 depending on the number of inputs, the modeling algorithm chosen, and whether neutral or protonated structures were used. The retention index model was developed with 400 compounds using a back-propagation artificial neural network and 33 Molconn structure descriptors. External validation gave a v² = 0.87 and a standard error of 38 retention index units.
As a test of the validity of the filtering approach, the Ecom₅₀ and retention index models, along with exact mass and collision-induced dissociation spectra matching, were used to identify 1,3-dicyclohexylurea in human plasma. This compound was not previously known to exist in human biofluids, and its elemental formula was identical to that of 315 other candidate compounds downloaded from PubChem. These results suggest that the use of Ecom₅₀ and retention index predictive models can improve nontargeted metabolite structure identification using HPLC/MS-derived structural features.
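
The candidate-filtering idea described in this abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function `filter_candidates`, the candidate names, all measured and predicted values, and the tolerance windows are hypothetical. Only the 38 retention index unit standard error comes from the abstract, and taking twice that value as the retention index window is an assumption.

```python
def filter_candidates(candidates, measured, tolerances):
    """Keep candidates whose predicted value for every measured feature
    lies within the given tolerance of the value measured for the unknown."""
    kept = []
    for candidate in candidates:
        within = all(
            abs(candidate[feature] - measured[feature]) <= tolerances[feature]
            for feature in measured
        )
        if within:
            kept.append(candidate)
    return kept


# Hypothetical measured features of the unknown (retention index, Ecom50 in eV).
measured = {"ri": 410.0, "ecom50": 1.20}

# Assumed windows: 2 x the 38-unit standard error for the retention index,
# and an arbitrary 0.1 eV for Ecom50.
tolerances = {"ri": 2 * 38.0, "ecom50": 0.1}

# Hypothetical candidates sharing the unknown's elemental formula,
# each with model-predicted feature values.
candidates = [
    {"name": "candidate_A", "ri": 430.0, "ecom50": 1.25},
    {"name": "candidate_B", "ri": 600.0, "ecom50": 1.22},
    {"name": "candidate_C", "ri": 400.0, "ecom50": 1.60},
]

remaining = [c["name"] for c in filter_candidates(candidates, measured, tolerances)]
```

A candidate is dropped as soon as any one feature falls outside its window, so each additional modeled feature can only shorten the list that must be obtained and measured for confirmation.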

Articles published on Retention Index Units

Machine Learning-Based Retention Time Prediction of Trimethylsilyl Derivatives of Metabolites.

Normalization of LC-MS mycotoxin determination using the N-alkylpyridinium-3-sulfonates (NAPS) retention index system

Gradient boosting for the prediction of gas chromatographic retention indices

Surface fitting for calculating the second dimension retention index in comprehensive two-dimensional gas chromatography mass spectrometry

Chromatographic efficiency of polar capillary columns applied for the analysis of fatty acid methyl esters by gas chromatography.

A regression model for calculating the second dimension retention index in comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry

Optimizing the relationship between chromatographic efficiency and retention times in temperature-programmed gas chromatography.

Optimizing artificial neural network models for metabolomics and systems biology: an example using HPLC retention index data.

Gas Chromatographic and Mass Spectrometric Characterization of Trimethylsilyl Derivatives of Some Terpene Alcohol Phenylpropenoids

Development of Ecom50 and Retention Index Models for Nontargeted Metabolomics: Identification of 1,3-Dicyclohexylurea in Human Serum by HPLC/Mass Spectrometry

The Robustness and Comparability of a Novel Rapid Reversed‐phase HPLC Drug‐screening Method Compared with Existing Systems

A method of calculating the second dimension retention index in comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry

Retention index thresholds for compound matching in GC–MS metabolite profiling

Contribution to linearly programmed temperature gas chromatography: Further application of the Van den Dool–Kratz equation, and a new utilization of the Sadtler retention index library

Conversion of programmed-temperature retention indices from one set of conditions to another

An accurate and easy procedure to obtain isothermal Kováts retention indices in gas chromatography

Retention in gas–liquid chromatography with a polyethylene oxide stationary phase: Molecular simulation and experiment

Prediction of gas chromatographic retention indices of a diverse set of toxicologically relevant compounds

Mosaic increments for predicting the gas chromatographic retention data of the chlorobenzenes

Temperature effects on the retention of n-alkanes and arenes in helium–squalane gas–liquid chromatography: Experiment and molecular simulation
