Abstract
The rapid development of methods for designing new pharmaceuticals calls for new, efficient computational approaches that can reliably predict various types of biological activity of organic compounds yet to be synthesized. The methods now widely used to search for quantitative structure-activity relationships (QSARs) have significant drawbacks. In particular, common methods of constructing 3D QSARs, such as comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), which underlie the state-of-the-art approaches to the design of new pharmaceuticals, are very sensitive to the dimensions, resolution, and spatial alignment of the hypothetical grid constructed around a molecule. This grid is used to approximate the electrostatic, steric, and hydrophobic molecular fields, whose potentials are calculated at the grid points and serve as molecular structure descriptors (1-3). This sensitivity makes the resulting 3D QSAR models ambiguous and, hence, the predictions based on them unreliable.

In this work, we propose a new method for constructing 3D QSAR models, namely, the method of continuous molecular fields (MCMF). The basic idea of this approach is to analyze continuous molecular fields directly rather than a discrete array of their potentials calculated at the points of a finite grid (as in the standard CoMFA and CoMSIA methods). Such a description corresponds better to the physical nature of molecular fields; therefore, 3D QSAR models can be expected to have better statistical characteristics after such a substitution. Until recently, it was impossible to use continuous molecular fields in statistical analysis, since common statistical procedures are designed to operate only with a finite number of molecular descriptors.
Therefore, we set out to implement this idea on the basis of recent statistical approaches, for example, the support vector machine (SVM), which is free of this limitation and can operate with an infinite number of variables (4). This is achieved by using so-called kernels (5). As is known, for any kernel there must exist a linear vector space (referred to as the reproducing kernel Hilbert space) in which the kernel can be uniquely represented as the scalar product of the corresponding vectors. It is evident that the scalar product of the molecular field potential values at the grid points is also a kernel (by definition). Inasmuch as an increase in the grid dimensions and a decrease in the grid cell size do not violate this property, the integral of the product of molecular fields taken over the entire physical space also remains a kernel, which can be used in appropriate statistical methods, such as support vector regression. This makes it possible to construct statistical models based on the description of molecular objects as continuous molecular fields. The basic element of the MCMF proposed in this work is the procedure for calculating kernels. The use of these kernels will be exemplified by constructing QSARs in the framework of the statistical method of support vector regression (5).
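
The limiting argument above (a grid scalar product that turns into an integral over all space as the grid grows and its cells shrink) can be illustrated with a small numerical sketch. The sketch rests on assumptions that are not taken from the text: each molecular field is modeled as a sum of atom-centered Gaussians with a common exponent ALPHA, the atom weights and coordinates are made-up toy values, and the function names are hypothetical. Under these assumptions the integral of the product of two fields has a closed form, which is compared with the grid scalar product multiplied by the cell volume.

```python
import math
from itertools import product

ALPHA = 0.5  # assumed Gaussian exponent (1/A^2); illustrative, not from the text


def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def field(mol, r, alpha=ALPHA):
    """Assumed field model: sum of weighted Gaussians centered on the atoms."""
    return sum(w * math.exp(-alpha * sq_dist(r, c)) for w, c in mol)


def continuous_kernel(mol_a, mol_b, alpha=ALPHA):
    """Integral of field_a * field_b over all space, in closed form.

    For two Gaussians with the same exponent alpha centered at a and b, the
    integral equals (pi / (2 alpha))**1.5 * exp(-alpha / 2 * |a - b|**2).
    """
    prefactor = (math.pi / (2.0 * alpha)) ** 1.5
    return prefactor * sum(
        wa * wb * math.exp(-0.5 * alpha * sq_dist(ca, cb))
        for wa, ca in mol_a
        for wb, cb in mol_b
    )


def grid_kernel(mol_a, mol_b, half_width=6.0, step=0.5, alpha=ALPHA):
    """Scalar product of grid-sampled fields times the cell volume."""
    n = int(round(2.0 * half_width / step))
    # Cell centers of a cubic grid spanning [-half_width, half_width] per axis.
    axis = [-half_width + (i + 0.5) * step for i in range(n)]
    total = sum(
        field(mol_a, r, alpha) * field(mol_b, r, alpha)
        for r in product(axis, repeat=3)
    )
    return total * step ** 3


# Toy "molecules": lists of (weight, (x, y, z)) pairs, e.g. partial charges.
mol_a = [(0.4, (0.0, 0.0, 0.0)), (-0.4, (1.2, 0.0, 0.0))]
mol_b = [(0.3, (0.2, 0.1, 0.0)), (-0.3, (1.0, 0.5, 0.3))]

print("closed-form integral:", continuous_kernel(mol_a, mol_b))
print("grid approximation:  ", grid_kernel(mol_a, mol_b))
```

Kernel values computed for every pair of molecules in a data set form a Gram matrix, which could then be passed to a kernel method that accepts precomputed kernels, for example scikit-learn's `SVR(kernel="precomputed")`. Whether the MCMF itself uses Gaussian fields or this closed form is not stated in the abstract.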