Abstract

Feature selection and feature extraction are among the most important steps in classification and regression systems. Feature selection is commonly used to reduce the dimensionality of datasets with tens or hundreds of thousands of features, which would otherwise be impractical to process. A recent example is a quantitative structure–activity relationship (QSAR) dataset with 1226 features. A major problem in QSAR is the high dimensionality of the feature space; feature selection is therefore the most important step in this study. This paper presents a novel feature selection algorithm based on entropy. The performance of the proposed algorithm is compared with that of a genetic algorithm method and a stepwise regression method. Using a multiple linear regression model in a QSAR study, the root mean square errors of prediction for the training set and test set were 0.3433 and 0.3591 with entropy-based selection, 0.5500 and 0.4326 with the genetic algorithm, and 0.6373 and 0.6672 with stepwise regression, respectively.
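The abstract does not describe the entropy criterion itself, so the following is only an illustrative sketch of one common way to score features by entropy, not the paper's algorithm. It assumes equal-width binning and Shannon entropy in bits; the function names (`shannon_entropy`, `rank_by_entropy`, `rmse`) and the default of four bins are hypothetical choices made for this example. The `rmse` helper shows the standard root mean square error formula of the kind used to report prediction error.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=4):
    """Shannon entropy (in bits) of one feature after equal-width binning.

    Assumption: equal-width bins; the paper may discretize differently.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature has zero entropy
    width = (hi - lo) / bins
    # Map each value to a bin index; the maximum value goes into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_by_entropy(columns, bins=4):
    """Return feature names ordered from highest to lowest entropy."""
    return sorted(
        columns,
        key=lambda name: shannon_entropy(columns[name], bins),
        reverse=True,
    )


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data, invented for illustration only (not from the QSAR dataset).
columns = {
    "constant": [1.0] * 8,
    "two_level": [0, 0, 0, 0, 1, 1, 1, 1],
    "spread": [0, 1, 2, 3, 4, 5, 6, 7],
}
ranking = rank_by_entropy(columns)
```

In this toy example the constant column scores lowest and the evenly spread column scores highest, so a selection step that keeps only the top-ranked features would discard the constant one first. A real QSAR workflow would also need to account for the relationship between each descriptor and the measured activity, which this sketch does not attempt.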
