Abstract

A novel framework for inverse quantitative structure–activity relationships (inverse QSAR) has recently been proposed and developed using both artificial neural networks and mixed integer linear programming. However, the classes of chemical graphs treated by the framework are limited. To deal with an arbitrary graph in the framework, we introduce a new model, called a two-layered model, and develop a corresponding method. In this model, each chemical graph is regarded as two parts: the exterior and the interior. The exterior consists of maximal acyclic induced subgraphs with bounded height, and the interior is the connected subgraph obtained by ignoring the exterior. The feature vector consists of the frequency of adjacent atom pairs in the interior and the frequency of chemical acyclic graphs in the exterior. Our method is more flexible than the existing method in the sense that any type of graph can be inferred. We compared the proposed method with an existing method using several data sets obtained from the PubChem database. The new method could infer more general chemical graphs with up to 50 non-hydrogen atoms. The proposed inverse QSAR method can thus be applied to the inference of more general chemical graphs than before.
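The interior/exterior split described in the abstract can be illustrated with a small sketch. It assumes one plausible reading of "maximal acyclic induced subgraphs with bounded height": degree-1 vertices are peeled off for a fixed number of rounds, here called rho (a hypothetical parameter name), and whatever remains is treated as the interior. The function names, the adjacency-list representation, and the toy molecule are all illustrative and not taken from the paper; bond multiplicities and the exterior tree frequencies are ignored.

```python
from collections import Counter


def split_two_layers(adj, rho):
    """Peel degree-1 vertices for up to rho rounds.

    Returns (interior, exterior) as sets of vertex ids. This is an
    assumed simplification of the two-layered model, not the paper's
    exact definition.
    """
    alive = set(adj)
    exterior = set()
    for _ in range(rho):
        # A leaf has exactly one neighbour among the surviving vertices.
        leaves = {v for v in alive
                  if sum(1 for u in adj[v] if u in alive) == 1}
        if not leaves:
            break
        alive -= leaves
        exterior |= leaves
    return alive, exterior


def interior_pair_counts(adj, atoms, interior):
    """Count adjacent atom pairs (unordered) among interior vertices."""
    counts = Counter()
    for u in interior:
        for v in adj[u]:
            if v in interior and u < v:  # visit each edge once
                counts[tuple(sorted((atoms[u], atoms[v])))] += 1
    return counts


# Toy hydrogen-suppressed graph: a three-membered carbon ring (0, 1, 2)
# with a side chain 0 - 3 (N) - 4 (O).
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
atoms = {0: "C", 1: "C", 2: "C", 3: "N", 4: "O"}

interior, exterior = split_two_layers(adj, rho=1)
pairs = interior_pair_counts(adj, atoms, interior)
```

With rho=1 only the terminal oxygen falls into the exterior; with rho=2 the whole side chain does, leaving just the ring as the interior. Note that a graph with no cycle can be peeled away entirely under this simplification, so a real implementation would need a rule for choosing the interior of acyclic graphs.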

Highlights

  • Computer-aided design of chemical structures is one of the key topics in chemoinformatics

  • In Stage 5, before we formulate a mixed integer linear programming (MILP) problem for inferring a target chemical graph G† for each instance I, we reduce the input layer of the artificial neural network (ANN) N constructed in Stage 3 so that the input layer consists only of input nodes that correspond to the descriptors used in the specification (GC, σint, σce) of the instance I, i.e., we remove any input nodes in N

  • The framework of designing chemical graphs using ANNs and MILP has been proposed [23] as a basis of a total system of the QSAR and the inverse of QSAR, where the inverse of a prediction function produced by an ANN is solved by an MILP
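To make the idea of inverting an ANN with an MILP more concrete, the following shows a standard big-M encoding of a single ReLU neuron, y = max(0, x). It is only an illustration under stated assumptions: the source does not say which activation function or which constraints the framework in [23] uses, and the symbols x, y, z, L, and U are introduced here. Assume the pre-activation x is a linear expression of the previous layer with known bounds L ≤ x ≤ U, where L < 0 < U, and let z be a binary variable indicating whether the neuron is active.

```latex
\begin{aligned}
y &\ge 0, & y &\ge x, \\
y &\le U z, & y &\le x - L(1 - z), \\
z &\in \{0, 1\}.
\end{aligned}
```

When z = 1 the constraints force y = x (and hence x ≥ 0); when z = 0 they force y = 0 (and hence x ≤ 0). Applying such constraints to every neuron turns the trained network into a set of linear constraints, so a solver can search for an input vector whose predicted output meets a target value.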


Summary

Introduction

Computer-aided design of chemical structures is one of the key topics in chemoinformatics. One approach is inverse quantitative structure–activity relationships (inverse QSAR), which seek chemical structures having desired chemical activities under some constraints. In this framework, chemical compounds are usually represented as vectors of real or integer numbers, which are often called descriptors in chemoinformatics and correspond to feature vectors in machine learning. Using these chemical descriptors, various heuristic and statistical methods have been developed for inverse QSAR [1,2,3]. Even inferring a chemical graph from a given feature vector is a challenging task because it is NP-hard (computationally difficult) except for some simple cases [9]. Due to this inherent difficulty, most existing methods for inverse QSAR do not guarantee optimal or exact solutions.

