Abstract

Optimisation of a spectrum-like structure representation by a genetic algorithm (GA) is described. The final optimised structure representation of 28 molecules (flavonoid derivatives, inhibitors of the enzyme p56 lck protein tyrosine kinase) contains only 15 variables, compared with 120 in the initial spectrum-like representation. The fitness function for the variable reduction in the GA procedure was provided by counterpropagation artificial neural network (ANN) models. Each chromosome in turn was used as the code for a new representation, and a new ANN model was trained and tested for it. The correlation coefficient r between the experimental biological activity and the value predicted by the ANN model was calculated for the test set of 14 compounds (not used in the training). This correlation coefficient r was used as the final fitness criterion governing selection and reproduction in the genetic procedure that generates the new population. Because the spectrum-like structure representation is reversible, each variable of the representation can be traced back to a structural feature. Consequently, the 15 variables selected by the GA optimisation pinpoint the spatial directions (with respect to the skeleton) most responsible for the biological activities of the entire series of compounds.
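The selection loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. A simple 1-nearest-neighbour predictor stands in for the counterpropagation ANN, and the data are synthetic, not the flavonoid set. The population size, mutation rate, one-point crossover, truncation selection and elitism are all assumptions, since the abstract does not specify them. The function names (`pearson_r`, `predict_1nn`, `fitness`, `evolve`, `make_toy_data`) are hypothetical.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient r; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def predict_1nn(train_X, train_y, x, mask):
    """Stand-in model (assumption): 1-nearest neighbour on the selected variables.
    The paper uses a counterpropagation ANN here instead."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((train_X[i][j] - x[j]) ** 2 for j in mask),
    )
    return train_y[nearest]


def fitness(chrom, train_X, train_y, test_X, test_y):
    """Fitness of one chromosome: r between predicted and observed test activities."""
    mask = [j for j, bit in enumerate(chrom) if bit]
    if not mask:  # a chromosome that selects no variables cannot be modelled
        return -1.0
    preds = [predict_1nn(train_X, train_y, x, mask) for x in test_X]
    return pearson_r(preds, test_y)


def evolve(train_X, train_y, test_X, test_y, n_vars,
           pop_size=20, generations=30, p_mut=0.02, seed=0):
    """Evolve binary masks over n_vars variables; all GA settings are assumed."""
    rng = random.Random(seed)

    def score(c):
        return fitness(c, train_X, train_y, test_X, test_y)

    # Each chromosome is a list of booleans: True means "keep this variable".
    pop = [[rng.random() < 0.2 for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0]]                 # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)     # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability p_mut.
            child = [bit != (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = children
    best = max(pop, key=score)
    return best, score(best)


def make_toy_data(n_mol=28, n_vars=120, seed=1):
    """Synthetic data only: activity depends on three arbitrarily chosen variables."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_vars)] for _ in range(n_mol)]
    y = [x[3] + 2 * x[40] - x[77] + rng.gauss(0, 0.05) for x in X]
    half = n_mol // 2  # 14 training and 14 test compounds, as in the abstract
    return X[:half], y[:half], X[half:], y[half:]
```

Calling `evolve(*make_toy_data(), n_vars=120)` returns the best mask found and its test-set r. Note that this sketch does not force the mask down to a fixed number of variables; reaching a compact set such as the 15 reported would need an additional size constraint or penalty, which the abstract does not describe.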
