Abstract
This paper broadly examines the data preparation stage of the knowledge discovery and data mining process, focusing on feature subset selection in the context of two well-known neural models: radial basis function neural networks and the multi-layer perceptron. Wrapper attribute selection methods, which rely on the evaluation measure provided by a classifier, are known to achieve the best performance; however, the time complexity of training neural networks practically precludes the use of wrapper techniques, especially in complex databases with high dimensionality and a large number of labels. In this paper, we propose using the Naïve Bayes classifier as the fitness function within a semi-wrapper feature selection approach. The Naïve Bayes classifier is a good, fast surrogate for a neural network, and utilising it as a measure of goodness in a backward search on a ranking provides an attribute selection method tailored to neural networks in complex data. The test-bed consists of 34 binary and multi-class classification problems, 6 of which have more than 5 classes, and 7 feature selectors. According to the reported accuracy results, which are supported by non-parametric statistical tests in different scenarios, our method is very suitable for both kinds of neural networks. Moreover, the reduced feature space is around 20% of the full attribute space. The speedup obtained with the semi-wrapper is substantial: on average, it ranges from about 1.5 with radial basis function neural networks to around 30 with the multi-layer perceptron.
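The idea of a backward search on a ranking with Naïve Bayes as the fitness function can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian form of Naïve Bayes, the ranking criterion (single-feature cross-validated accuracy), the 3-fold split, the rule of dropping a feature whenever accuracy does not decrease, and the synthetic data are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def fit_gnb(X, y):
    """Fit Gaussian Naive Bayes: per-class log prior, feature means, variances."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        model[c] = (
            math.log(len(rows) / len(X)),
            [mean(col) for col in cols],
            [pvariance(col) + 1e-9 for col in cols],  # small term avoids zero variance
        )
    return model


def predict_gnb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_post(c):
        prior, mus, variances = model[c]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, mus, variances)
        )
    return max(model, key=log_post)


def nb_fitness(X, y, feats, k=3):
    """k-fold cross-validated Naive Bayes accuracy using only the features in feats."""
    Xs = [[row[j] for j in feats] for row in X]
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = fit_gnb([Xs[i] for i in train], [y[i] for i in train])
        correct += sum(predict_gnb(model, Xs[i]) == y[i] for i in test)
    return correct / len(X)


def backward_search_on_ranking(X, y):
    """Visit features from worst to best ranked; drop one if fitness does not fall."""
    n_feats = len(X[0])
    # Assumed ranking criterion: accuracy of each feature on its own, worst first.
    ranking = sorted(range(n_feats), key=lambda j: nb_fitness(X, y, [j]))
    selected = list(range(n_feats))
    best = nb_fitness(X, y, selected)
    for j in ranking:
        if len(selected) == 1:
            break
        candidate = [f for f in selected if f != j]
        score = nb_fitness(X, y, candidate)
        if score >= best:  # ties favour the smaller subset (an assumption)
            selected, best = candidate, score
    return selected, best


# Synthetic data (assumption): feature 0 separates the classes, features 1-3 are noise.
random.seed(0)
y = [i % 2 for i in range(120)]
X = [[5.0 * label + random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(3)]
     for label in y]

full_score = nb_fitness(X, y, list(range(4)))
selected, best = backward_search_on_ranking(X, y)
print("selected features:", selected, "fitness:", round(best, 3))
```

Because every candidate subset is scored by Naïve Bayes rather than by training a neural network, the cost of the search is decoupled from the cost of the final model, which is then trained once on the selected features.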