Abstract
A novel descriptor selection scheme for the Support Vector Machine (SVM) classification method is proposed, and its utility is demonstrated using a skin sensitisation dataset as an example. A backward elimination procedure, guided by the mean accuracy (the average of specificity and sensitivity) of a leave-one-out cross-validation, was devised for the SVM. Subsets of descriptors were first selected using a sequential t-test filter or a Random Forest filter before backward elimination was applied. Different SVM kernels were compared using this descriptor selection scheme. The Radial Basis Function (RBF) kernel worked best when a sequential t-test filter was adopted. The highest mean accuracy, 84.9%, was obtained using an SVM with 23 descriptors. The sensitivity and the specificity were as high as 93.1% and 76.6%, respectively. A linear kernel was found to be optimal when a Random Forest filter was used. Its performance using 24 descriptors was comparable with that of an RBF kernel with a sequential t-test filter. For comparison, Fisher's linear discriminant analysis (LDA) was carried out under the same descriptor selection scheme. The SVM was shown to outperform LDA.
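The selection loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how many descriptors are removed per step or when elimination stops, so the sketch assumes one descriptor is dropped per step and the best-scoring subset seen is kept. All function names are hypothetical, and `nearest_centroid` is a dependency-free stand-in classifier, not an SVM; in practice an SVM training routine would be passed as `fit_predict`.

```python
def mean_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity (labels: 1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    sensitivity = tp / n_pos if n_pos else 0.0
    specificity = tn / n_neg if n_neg else 0.0
    return (sensitivity + specificity) / 2.0


def loo_mean_accuracy(X, y, cols, fit_predict):
    """Leave-one-out cross-validation using only the descriptor columns in `cols`."""
    preds = []
    for i in range(len(X)):
        X_train = [[row[c] for c in cols] for j, row in enumerate(X) if j != i]
        y_train = [label for j, label in enumerate(y) if j != i]
        x_test = [X[i][c] for c in cols]
        preds.append(fit_predict(X_train, y_train, x_test))
    return mean_accuracy(y, preds)


def backward_elimination(X, y, cols, fit_predict):
    """Assumed scheme: drop one descriptor per step, keep the best subset seen."""
    current = list(cols)
    best_cols = list(current)
    best_score = loo_mean_accuracy(X, y, current, fit_predict)
    while len(current) > 1:
        # Score every subset obtained by removing a single descriptor.
        scored = [
            (loo_mean_accuracy(X, y, [c for c in current if c != d], fit_predict), d)
            for d in current
        ]
        score, dropped = max(scored, key=lambda s: s[0])
        current = [c for c in current if c != dropped]
        if score >= best_score:  # ties favour the smaller subset
            best_cols, best_score = list(current), score
    return best_cols, best_score


def nearest_centroid(X_train, y_train, x_test):
    """Stand-in classifier (NOT an SVM): predict the class with the closer centroid.

    Assumes both classes are present in the training data.
    """
    def centroid(label):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return 1 if dist2(x_test, c1) < dist2(x_test, c0) else 0


# Made-up toy data: column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.2, -3.0], [0.1, 4.0], [1.0, -4.0], [0.9, 5.0], [1.1, -5.0]]
y = [0, 0, 0, 1, 1, 1]

selected, score = backward_elimination(X, y, [0, 1], nearest_centroid)
print(selected, score)  # prints: [0] 1.0
```

On this toy data the noisy column is eliminated and the remaining descriptor classifies every held-out sample correctly. Scoring by mean accuracy rather than by overall accuracy gives both classes equal weight even when they differ in size.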