Abstract

High-throughput virtual screening evaluates a library of molecules for potential drug activity, and implementing such virtual bioactivity screening relies on predictive quantitative structure-activity relationship (QSAR) models. This chapter addresses three approaches to feature selection for QSAR problems based on evolutionary algorithms (EA): common feature extraction with a genetic algorithm (GA) for a learning model, GA-scaled regression clustering, and GA-based feature selection from the correlation matrix. The chapter briefly explains the common GA-based method for feature selection in QSAR and expands on two novel approaches. It also demonstrates a hybrid method that combines GA-based feature selection with sensitivity analysis, and describes a comparative benchmark of feature selection for an HIV-relevant QSAR model. Although the feature selection methods are all GA-based, the predictive models are a backpropagation-trained neural network and partial least squares. The goal of QSAR is to predict the bioactivity of molecules from a set of descriptive features. The underlying assumption is that variations in biological activity can be correlated with measured or calculated molecular properties. Several types of descriptors are traditionally used in QSAR investigations, including 2D, electrotopological, 3D, and transferable atom equivalent (TAE) descriptors.
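
As a rough illustration of the general idea of GA-based feature selection wrapped around a learning model, the following minimal sketch may help. It is not the chapter's implementation: the function names (`fitness`, `ga_select`), the parameters, the size penalty, and the synthetic data are all assumptions made for illustration, and an ordinary least-squares fit stands in for the neural network and partial least squares models that the chapter actually uses.

```python
import numpy as np

def fitness(mask, X_train, y_train, X_val, y_val, penalty=0.01):
    """Score a feature subset: negative validation MSE minus a size penalty.

    Assumption: a least-squares linear model replaces the chapter's learners.
    """
    if not mask.any():
        return -np.inf  # an empty subset cannot predict anything
    A_train = np.column_stack([X_train[:, mask], np.ones(len(X_train))])
    A_val = np.column_stack([X_val[:, mask], np.ones(len(X_val))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    mse = np.mean((A_val @ coef - y_val) ** 2)
    return -mse - penalty * mask.sum()

def ga_select(X_train, y_train, X_val, y_val,
              pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    """Evolve boolean feature masks; return the best mask and fitness history."""
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    pop = rng.random((pop_size, n_features)) < 0.5  # random initial masks

    def score(population):
        return np.array([fitness(ind, X_train, y_train, X_val, y_val)
                         for ind in population])

    scores = score(pop)
    history = [scores.max()]
    for _ in range(generations):
        best = pop[np.argmax(scores)].copy()
        # Binary tournament selection: the fitter of two random individuals wins.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = np.where((scores[a] >= scores[b])[:, None], pop[a], pop[b])
        # Uniform crossover with a randomly shuffled partner.
        partners = parents[rng.permutation(pop_size)]
        take_first = rng.random((pop_size, n_features)) < 0.5
        children = np.where(take_first, parents, partners)
        # Bit-flip mutation.
        children ^= rng.random((pop_size, n_features)) < mutation_rate
        # Elitism: carry the best individual over unchanged.
        children[0] = best
        pop = children
        scores = score(pop)
        history.append(scores.max())
    return pop[np.argmax(scores)], history

# Synthetic stand-in for a descriptor matrix: only columns 0 and 3 matter.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(120, 8))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * data_rng.normal(size=120)

mask, history = ga_select(X[:80], y[:80], X[80:], y[80:])
print("selected descriptor columns:", np.flatnonzero(mask))
```

The size penalty is one simple way to discourage large descriptor subsets. Because of elitism, the best fitness recorded in `history` never decreases from one generation to the next.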
