Abstract

Improved models for predicting viscosities at 20 °C were generated using three different methods for descriptor selection. A data set of 361 diverse organic molecules and their experimental viscosities was used to develop the models. Molecular properties were encoded by 822 initial descriptors computed by the CODESSA program. The CODESSA, GFA and CROMRsel methods, which are automated procedures for generating simple multiregression (MR) models, are each capable of selecting good and facile viscosity models having only five descriptors. All three methods produce excellent linear models, but the models obtained by the CROMRsel method are somewhat better. In addition, using the CROMRsel suite of programs, a very good nonlinear MR model having five descriptors (two linear and three cross-product descriptors; R² = 0.908, S = 0.175) was obtained. The nonlinear models generated in this study show that classical MR-based methods can be used efficiently to obtain simple and very good nonlinear MR models. The best five-descriptor models selected in this study usually contain one geometrical descriptor (gravitational index), one topological descriptor (Randić index of order 0), and three electrostatic descriptors that reflect the bonding properties of molecules, i.e. their capability to form (mainly) hydrogen bonds. Consequently, hydrogen-donor and hydrogen-acceptor surface areas, charges, total molecular surface areas, and maximum net atomic charges and state energies for oxygen atoms appear to be key factors for modeling the viscosity of organic molecules.
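To make the idea of selecting a small MR model from a pool of linear and cross-product descriptors concrete, the following is a minimal sketch in Python. It is not the CODESSA, GFA or CROMRsel algorithm, whose internals are not described here; it simply scores every subset of a given size by R² using ordinary least squares. The descriptor names (d0 to d3), the synthetic data and the helper functions `solve`, `fit_r2` and `best_subset` are all assumptions made for illustration.

```python
from itertools import combinations
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_r2(columns, y):
    """Least-squares fit of y on the columns plus an intercept; returns R^2."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    Xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    beta = solve(XtX, Xty)
    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, pred))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst


def best_subset(pool, y, size):
    """Exhaustively score every subset of `size` candidate terms by R^2."""
    return max(combinations(sorted(pool), size),
               key=lambda names: fit_r2([pool[n] for n in names], y))


# Synthetic, hypothetical data: four descriptors and a response that depends
# on one linear term (d0) and one cross-product term (d1*d2).
random.seed(0)
n = 60
d = {f"d{i}": [random.gauss(0, 1) for _ in range(n)] for i in range(4)}
y = [1.0 + 2.0 * d["d0"][k] + 1.5 * d["d1"][k] * d["d2"][k]
     + random.gauss(0, 0.1) for k in range(n)]

# Candidate pool: the linear descriptors plus all pairwise cross-products.
pool = dict(d)
for a, b in combinations(sorted(d), 2):
    pool[f"{a}*{b}"] = [u * v for u, v in zip(d[a], d[b])]

chosen = best_subset(pool, y, 2)
print(chosen, round(fit_r2([pool[name] for name in chosen], y), 3))
```

An exhaustive search like this is only practical for small pools; with hundreds of initial descriptors, as in the study, some form of pre-screening or heuristic search would be needed, and the sketch makes no claim about how the three programs do this.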

Highlights

  • Developing models for predicting experimental properties of molecules that are not readily measurable is a growing field of research

  • A short description of these methods is given in the Experimental section. We will compare these three methods of model generation used in the field of QSPR in order to rank them according to their usefulness in developing (1) good QSPR models and (2) facile/straightforward models

  • Some of the descriptor selection methods used in the modeling process (such as Neural Network Ensemble (NNE), ref. 11) produce models that are not simple and that are difficult to interpret


Summary

Introduction

Developing models for predicting experimental properties of molecules that are not readily measurable is a growing field of research.

Parameters: R² = 0.8536, R²cv = 0.8446, S = 0.221, Scv = 0.228, F = 414.1. (a) X = regression coefficient, ΔX = error of the regression coefficient; (b) note: descriptors having higher t-test values are the more significant ones.

Table 3 gives the best MR models containing one to five descriptors selected by the GFA method, which is incorporated in the Cerius2 program package.
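The fit statistics quoted in this summary (R², cross-validated R², S, Scv and F) can be illustrated with a short Python sketch. The definitions used here are assumptions, since the excerpt does not state them: cross-validation is taken to be leave-one-out, S is taken as the standard error of estimate sqrt(SSE / (n − p − 1)), and Scv is the same expression with the leave-one-out sum of squares (PRESS) in place of SSE. For brevity the sketch uses a single descriptor (p = 1) and a small made-up data set; the function names `fit_line` and `fit_statistics` are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_statistics(x, y):
    """R2, leave-one-out R2cv, S, Scv and F for a one-descriptor model."""
    n, p = len(y), 1
    b0, b1 = fit_line(x, y)
    my = sum(y) / n
    sst = sum((v - my) ** 2 for v in y)
    sse = sum((v - (b0 + b1 * u)) ** 2 for u, v in zip(x, y))
    # Leave-one-out: refit without point i, then predict point i.
    press = 0.0
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (c0 + c1 * x[i])) ** 2
    dof = n - p - 1
    return {
        "R2": 1.0 - sse / sst,
        "R2cv": 1.0 - press / sst,
        "S": sqrt(sse / dof),
        "Scv": sqrt(press / dof),  # assumed definition, see lead-in
        "F": ((sst - sse) / p) / (sse / dof),
    }


# Made-up illustrative data, not taken from the study.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
stats = fit_statistics(x, y)
for name, value in stats.items():
    print(f"{name} = {value:.4f}")
```

Under these definitions the cross-validated values can never look better than the fitted ones (R²cv ≤ R², Scv ≥ S), which matches the pattern in the parameters quoted above.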
