Abstract

Preprocessing of data is ubiquitous, and choosing significant attributes has been one of the important steps in the prior processing of data. Feature selection is used to create a subset of relevant feature for effective classification of data. In a classification of high-dimensional data, the classifier usually depends on the feature subset that has been used for classification. The Relief algorithm is a popular heuristic approach to select significant feature subsets. The Relief algorithm estimates feature individually and selects top-scored feature for subset generation. Many extensions of the Relief algorithm have been developed. However, an important defect in the Relief-based algorithms has been ignored for years. Because of the uncertainty and noise of the instances used for measuring the feature score in the Relief algorithm, the outcome results will vacillate with the instances, which lead to poor classification accuracy. To fix this problem, a novel feature selection algorithm based on Chebyshev distance-outlier detection model is proposed called noisy feature removal-Relief, NFR-ReliefF in short. To demonstrate the performance of NFR-ReliefF algorithm, an extensive experiment, including classification tests, has been carried out on nine benchmarking high-dimensional datasets by uniting the proposed model with standard classifiers, including the naive Bayes, C4.5 and KNN. The results prove that NFR-ReliefF outperforms the other models on most tested datasets.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.