Abstract

A new and general approach to forming Structure-Activity Relationships (SARs) is described. The approach represents chemical structure by atoms and their bond connectivities, and uses the Inductive Logic Programming (ILP) algorithm Progol to learn from that representation. Existing SAR methods describe chemical structure using attributes, which are general properties of an object. Chemical structure cannot be mapped directly onto attribute-based descriptions, because such descriptions have no internal organisation. A more natural and general way to describe chemical structure is a relational description, in which the internal construction of the description mirrors that of the object described. Our atom and bond connectivity representation is a relational description, and ILP algorithms can form SARs from relational descriptions. We tested the relational approach by investigating the SAR of 230 aromatic and heteroaromatic nitro compounds. These compounds had previously been split into two subsets: 188 compounds that were amenable to regression, and 42 that were not. For the 188 compounds, a SAR was found that was as accurate as the best SARs generated by statistical methods or neural networks. The Progol SAR has two advantages: it did not require any indicator variables hand-crafted by an expert, and the generated rules were easily comprehensible. For the 42 compounds, Progol formed a SAR that was significantly (P < 0.025) more accurate than linear regression, quadratic regression, and back-propagation. This SAR is based on a new, automatically generated structural alert for mutagenicity.
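
The contrast between attribute-based and relational descriptions can be illustrated with a minimal Python sketch. It is not the representation or the rules used in the study, and it does not use Progol. The names `Atom`, `Bond`, `attribute_vector`, and `has_n_double_bonded_to_o`, the two toy fragments, and the pattern being tested are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    atom_id: str
    element: str


class Bond(NamedTuple):
    a: str
    b: str
    order: int  # 1 = single, 2 = double (toy encoding, an assumption)


# Two hypothetical fragments built from the same atoms but wired differently.
ATOMS = [Atom("a1", "c"), Atom("a2", "n"), Atom("a3", "o"), Atom("a4", "o")]
BONDS_X = [Bond("a1", "a2", 1), Bond("a2", "a3", 2), Bond("a2", "a4", 1)]
BONDS_Y = [Bond("a1", "a2", 1), Bond("a1", "a3", 2), Bond("a2", "a4", 1)]


def attribute_vector(atoms):
    """Attribute-based view: element counts only, no internal organisation."""
    counts = {}
    for atom in atoms:
        counts[atom.element] = counts.get(atom.element, 0) + 1
    return counts


def neighbours(atom_id, bonds):
    """Return (neighbour_id, bond_order) pairs for one atom."""
    result = []
    for bond in bonds:
        if bond.a == atom_id:
            result.append((bond.b, bond.order))
        elif bond.b == atom_id:
            result.append((bond.a, bond.order))
    return result


def has_n_double_bonded_to_o(atoms, bonds):
    """Relational query: is some nitrogen double-bonded to an oxygen?"""
    element = {atom.atom_id: atom.element for atom in atoms}
    for atom in atoms:
        if atom.element != "n":
            continue
        for other, order in neighbours(atom.atom_id, bonds):
            if element[other] == "o" and order == 2:
                return True
    return False


if __name__ == "__main__":
    print(attribute_vector(ATOMS))
    print(has_n_double_bonded_to_o(ATOMS, BONDS_X))
    print(has_n_double_bonded_to_o(ATOMS, BONDS_Y))
```

Both fragments yield the same element counts, so the attribute view cannot tell them apart, whereas the relational query over atoms and bonds can. An ILP system searches for rules of this relational kind automatically, rather than having them written by hand as in this sketch.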
