A support vector machine (SVM) algorithm was applied to predict the gas chromatographic (GC) relative retention times (RRTs) of 126 polybrominated diphenyl ether (PBDE) congeners on seven stationary phases. A total of 151 topological and connectivity index descriptors were calculated with the E-dragon software. A genetic algorithm (GA) coupled with multiple linear regression (MLR) was used to select optimal descriptor subsets from this large pool. Overall, the support vector regression models give training-set correlation coefficients R² greater than 0.996, except for the CP-Sil19 column, for which Q²loo (the correlation coefficient of leave-one-out cross-validation) and the test-set correlation coefficient R² are larger than 0.992. These statistical parameters indicate that the models are robust and have high internal and external predictive capability.
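The abstract reports Q²loo without giving its formula. The sketch below assumes the definition conventional in QSPR work, Q²loo = 1 − PRESS / SS_total, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it. The function names, the one-descriptor least-squares fit (a simple stand-in for the support vector regression used in the study), and the toy numbers are illustrative assumptions, not the authors' code or data.

```python
from statistics import mean


def fit_line(xs, ys):
    """One-descriptor least-squares fit; a stand-in for the study's SVR model."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def q2_loo(xs, ys, fit=fit_line):
    """Leave-one-out Q^2, assumed here to be 1 - PRESS / SS_total."""
    y_bar = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        predict = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(xs[i])) ** 2
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Made-up values for illustration only (not data from the study).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rrt = [0.11, 0.19, 0.32, 0.41, 0.48, 0.62]
print(round(q2_loo(descriptor, rrt), 3))
```

Passing a different `fit` callable, for example one wrapping a support vector regressor over the GA-selected descriptors, would leave the cross-validation loop unchanged.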