Abstract
Owing to their relative independence from operational parameters, linear retention indices (LRIs) are a useful tool in gas chromatography-mass spectrometry (GC-MS) qualitative analysis. The aim of the current study was to develop a multiple linear regression (MLR) model for the prediction of LRIs as a function of selected molecular descriptors. Liquid injection GC-MS was used to analyse essential oils (rose, lavender and peppermint), with the ingredients separated on a semi-standard non-polar stationary phase. As a result, a total of 103 compounds were identified, and their experimental LRIs were derived from reference measurements of a standard mixture of n-alkanes (C8 to C20). As a next step, a set of molecular descriptors was generated for the identified chemical structures. A stepwise MLR was then applied to select the significant descriptors (variables) that can be used to predict the LRIs. From an initial set of over 2000 molecular descriptors, only 16 were found to be significant and independent variables. At this point split validation was applied: the identified compounds were randomly divided into training (85%) and validation (15%) sets. The training set (87 compounds) was used to derive two MLR models by applying i) the ‘enter’ algorithm (R2 = 0.9960, RMSE = 17) and ii) the ‘stepwise’ algorithm (R2 = 0.9958, RMSE = 17). The predictive power was assessed with the validation set (16 compounds) as follows: i) q2F1 = 0.9896, RMSE = 25 and ii) q2F1 = 0.9886, RMSE = 26, respectively. The adequacy of both regression approaches was further evaluated. Newly developed headspace solid-phase microextraction (HS-SPME) procedures in combination with GC-MS were used for an alternative analysis of the studied essential oils.
Twelve additional compounds, not previously detected with liquid sample introduction, were identified for which the values of the significant descriptors were within the working range of the developed MLR models. For these compounds, the LRIs were calculated, and the experimental data were used as an external set for assessing the regression models. The predictive power of both regression approaches was as follows: enter, RMSE = 41, q2F2 = 0.9503; stepwise, RMSE = 40, q2F2 = 0.9521.
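
The quantities reported above can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the sketch assumes the commonly used definitions: the van den Dool and Kratz interpolation between bracketing n-alkanes for the LRI, q2F1 referenced to the training-set mean, and q2F2 referenced to the external-set mean. All function names and example values are hypothetical and are not taken from the study.

```python
import math


def linear_retention_index(t_x, alkane_times):
    """LRI by linear interpolation between the two bracketing n-alkanes.

    alkane_times maps carbon number -> retention time (assumed layout).
    """
    carbons = sorted(alkane_times)
    for n, n_next in zip(carbons, carbons[1:]):
        t_n, t_next = alkane_times[n], alkane_times[n_next]
        if t_n <= t_x <= t_next:
            fraction = (t_x - t_n) / (t_next - t_n)
            return 100.0 * (n + (n_next - n) * fraction)
    raise ValueError("retention time lies outside the n-alkane range")


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def q2_f1(observed_ext, predicted_ext, train_mean):
    """q2F1: prediction error relative to deviation from the TRAINING mean."""
    press = sum((o - p) ** 2 for o, p in zip(observed_ext, predicted_ext))
    total = sum((o - train_mean) ** 2 for o in observed_ext)
    return 1.0 - press / total


def q2_f2(observed_ext, predicted_ext):
    """q2F2: prediction error relative to deviation from the EXTERNAL mean."""
    ext_mean = sum(observed_ext) / len(observed_ext)
    press = sum((o - p) ** 2 for o, p in zip(observed_ext, predicted_ext))
    total = sum((o - ext_mean) ** 2 for o in observed_ext)
    return 1.0 - press / total


if __name__ == "__main__":
    # Hypothetical retention times (min) for C8 and C9 only.
    alkanes = {8: 4.0, 9: 6.0}
    print(linear_retention_index(5.0, alkanes))
    print(rmse([1000.0, 1100.0], [1010.0, 1080.0]))
```

Because q2F2 uses the mean of the external set itself, it does not depend on how far the external compounds lie from the training-set average, which is one reason it is often reported for a truly external set such as the twelve HS-SPME compounds.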