Abstract
Evolutionary and genetic algorithms are powerful tools for searching global optima of complex functions. An evolutionary approach, the MUSEUM (mutation and selection uncover models) programme, is applied to various QSAR data sets to prove the general applicability of this approach for variable selection in regression and PLS analyses. ‘Best’ regression models are found within seconds or a few minutes of calculation time, even for data sets including large numbers of variables. The MUSEUM algorithm starts from an arbitrary model and adds or eliminates variables to or from this model in a random manner. Any ‘better’ model defined by a certain fitness criterion is taken as a new breeding organism which is mutated by further variable additions, eliminations or exchanges. In this manner the models improve gradually until a global optimum or at least a good local optimum results. In most cases several different models are obtained from different runs. A systematic search for the best models indicates that in all cases the global optima and good local optima result from the evolutionary search. Most often the fit and cross-validation results of these regression models are better than the fit and cross-validation results of a PLS analysis which includes all variables of the data set. The variables contained in the best regression models are suitable as subsets for PLS analyses and some of these PLS results are even better than the best regression results.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.