Abstract

This work describes convenient, interpretable models built by stepwise multiple linear regression (SW-MLR) and genetic algorithm-multiple linear regression (GA-MLR). These quantitative structure-retention relationship (QSRR) strategies were used to predict the retention indices (RIs) of a series of natural compounds found in the essential oil of Pistacia lentiscus L. The dataset was randomly divided into a training set (51 compounds) and a test set (25 compounds). The predictive ability of both approaches was assessed by applying them to the test-set compounds. The fit statistics of both models were satisfactory, and the resulting predictions were accurate. The best optimized SW-MLR model contained 5 molecular descriptors (X0, Qtot, Dv, HATS3e and MATS4e), and the best GA-MLR model contained 6 (MATS2e, DELS, ATS4e, PW5, PCD and W). SW-MLR (R2 = 0.96 for the training set; REP = 3.7 for the test set) outperformed the best GA-MLR model (R2 = 0.955 for the training set; REP = 7.5 for the test set) for estimating the RIs of similar or unknown compounds.
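
The workflow summarized above (random 51/25 split, descriptor selection, MLR fit, R2 on the training set, REP on the test set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the study's descriptors or RIs, the selection step is a simple greedy forward selection rather than the authors' SW-MLR or GA-MLR procedure, and REP is taken to be the root-mean-square error expressed as a percentage of the mean observed value, which the abstract does not define.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply a fitted MLR model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rep_percent(y, y_hat):
    """Assumed REP definition: RMSE as a percentage of the mean observed value."""
    return 100.0 / y.mean() * np.sqrt(np.mean((y - y_hat) ** 2))

def forward_select(X, y, max_terms):
    """Greedy forward selection: at each step add the descriptor that
    gives the largest training R2 (a simplification of stepwise MLR)."""
    selected = []
    for _ in range(max_terms):
        best_j, best_r2 = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            coef = fit_mlr(X[:, cols], y)
            r2 = r_squared(y, predict_mlr(coef, X[:, cols]))
            if r2 > best_r2:
                best_j, best_r2 = j, r2
        selected.append(best_j)
    return selected

# Synthetic stand-in data: 76 "compounds", 12 hypothetical descriptors.
rng = np.random.default_rng(0)
n, p = 76, 12
X = rng.normal(size=(n, p))
y = (1200.0 + 150.0 * X[:, 0] - 80.0 * X[:, 3] + 40.0 * X[:, 7]
     + rng.normal(scale=10.0, size=n))

# Random split into 51 training and 25 test compounds, as in the abstract.
idx = rng.permutation(n)
train, test = idx[:51], idx[51:]

cols = forward_select(X[train], y[train], max_terms=5)
coef = fit_mlr(X[train][:, cols], y[train])
r2_train = r_squared(y[train], predict_mlr(coef, X[train][:, cols]))
rep_test = rep_percent(y[test], predict_mlr(coef, X[test][:, cols]))

print("selected descriptor columns:", cols)
print("training R2:", round(r2_train, 3))
print("test REP (%):", round(rep_test, 2))
```

Reporting R2 on the training set and REP on the held-out test set mirrors how the two models are compared above: a model can fit the training compounds about as well as another yet generalize worse, which is what a higher test-set REP indicates.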
