Abstract

In QSAR studies of large data sets, variable selection and model building are difficult, time‐consuming and ambiguous procedures. Stepwise regression is most often applied for this purpose, but other strategies, such as neural networks, cluster significance analysis or genetic algorithms, have also been used. This work describes a simple and efficient evolutionary strategy that includes iterative mutation and selection but avoids crossover of regression models. The MUSEUM (Mutation and Selection Uncover Models) algorithm starts from a model containing any number of randomly chosen variables. Random mutation, first by addition or elimination of only one or very few variables, afterwards by simultaneous random additions, eliminations and/or exchanges of several variables at a time, leads to new models, which are evaluated by an appropriate fitness function. In contrast to common genetic algorithm procedures, only the “fittest” model is stored and used for further mutation and selection, leading to progressively better models. In the last steps of mutation, all variables inside the model are eliminated and all variables outside the model are added, one by one, to check whether this systematic strategy detects any mutation that still improves the model. Whenever a better model is generated, a new random mutation procedure starts from this model. In the very last step, variables not significant at the 95% level are eliminated, starting with the least significant variable. In this manner, “stable” models are produced that contain only significant variables. A comparison of the results for the Selwood data set (n = 31 compounds, k = 53 variables) with those obtained by other groups shows that the evolutionary approach derives more relevant models than other methods.
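The mutation-and-selection loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `museum_search`, its parameters (`n_random`, `max_flips`, `seed`), and the toy fitness function are all assumptions made for illustration. The abstract does not specify the fitness function, so the sketch accepts any callable that scores a set of variable indices, and it omits the final step that removes variables not significant at the 95% level, since that step needs regression statistics.

```python
import random


def museum_search(n_vars, fitness, n_random=200, max_flips=3, seed=0):
    """Mutation-and-selection search over variable subsets; no crossover.

    `fitness` maps a frozenset of variable indices to a score (higher is
    better). All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Start from a model with a random number of randomly chosen variables.
    size = rng.randint(1, n_vars)
    best = frozenset(rng.sample(range(n_vars), size))
    best_fit = fitness(best)

    while True:
        # Random phase: flip a single variable in the first half of the
        # steps, then several at once (additions, eliminations, exchanges).
        for step in range(n_random):
            n_flips = 1 if step < n_random // 2 else rng.randint(2, max_flips)
            flips = rng.sample(range(n_vars), n_flips)
            candidate = best.symmetric_difference(flips)
            cand_fit = fitness(candidate)
            if cand_fit > best_fit:  # only the fittest model is kept
                best, best_fit = candidate, cand_fit

        # Systematic phase: try removing each variable inside the model and
        # adding each variable outside it, one by one.
        for v in range(n_vars):
            candidate = best.symmetric_difference([v])
            cand_fit = fitness(candidate)
            if cand_fit > best_fit:
                best, best_fit = candidate, cand_fit
                break  # a better model was found: restart random mutation
        else:
            return best, best_fit  # no single change improves the model


# Hypothetical toy fitness (not a regression statistic): reward variables
# from a hidden "relevant" set and penalize every other variable.
relevant = frozenset({2, 7, 11})


def toy_fitness(model):
    return len(model & relevant) - 0.5 * len(model - relevant)


best, best_fit = museum_search(20, toy_fitness)
print(sorted(best), best_fit)
```

With this toy fitness, any model that cannot be improved by a single addition or elimination is exactly the hidden set, so the search returns variables 2, 7 and 11. In a real QSAR setting, `fitness` would instead fit a regression on the selected descriptors and return a quality measure chosen by the user, and it would need to handle the empty model.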
