Empirical Case Studies in Attribute Noise Detection

T. M. Khoshgoftaar, J. Van Hulse

doi:10.1109/tsmcc.2009.2013815

Abstract

The quality of data is an important issue in any domain-specific data mining and knowledge discovery initiative. The validity of solutions produced by data-driven algorithms can be diminished if the data being analyzed are of low quality. The quality of data is often realized in terms of data noise present in the given dataset and can include noisy attributes or labeling errors. Hence, tools for improving the quality of data are important to the data mining analyst. We present a comprehensive empirical investigation of our new and innovative technique for ranking attributes in a given dataset from most to least noisy. Upon identifying the noisy attributes, specific treatments can be applied depending on how the data are to be used. In a classification setting, for example, if the class label is determined to contain the most noise, processes to cleanse this important attribute may be undertaken. Independent variables or predictors that have a low correlation to the class attribute and appear noisy may be eliminated from the analysis. Several case studies using both real-world and synthetic datasets are presented in this study. The noise detection performance is evaluated by injecting noise into multiple attributes at different noise levels. The empirical results demonstrate conclusively that our technique provides a very accurate and useful ranking of noisy attributes in a given dataset.
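The abstract says that detection performance is evaluated by injecting noise into multiple attributes at different noise levels, but it does not describe the injection procedure. The sketch below is therefore only one plausible reading, under stated assumptions: attributes are numeric, a "noise level" is the fraction of instances corrupted per attribute, and a corrupted value is redrawn uniformly from the attribute's observed range. The function name `inject_attribute_noise` and its parameters are hypothetical and do not come from the paper.

```python
import random


def inject_attribute_noise(rows, attribute_indices, noise_level, seed=0):
    """Corrupt a fraction of the values in the chosen numeric attributes.

    Assumption (not from the paper): a corrupted value is replaced by a
    uniform draw from the observed [min, max] range of that attribute.
    Returns a noisy copy of the data and, per attribute, the sorted list
    of row indices that were corrupted.
    """
    rng = random.Random(seed)
    noisy = [list(row) for row in rows]      # copy, so the input is not mutated
    n = len(rows)
    n_corrupt = round(noise_level * n)       # instances to corrupt per attribute
    corrupted = {}
    for j in attribute_indices:
        column = [row[j] for row in rows]
        low, high = min(column), max(column)
        chosen = rng.sample(range(n), n_corrupt)
        for i in chosen:
            noisy[i][j] = rng.uniform(low, high)
        corrupted[j] = sorted(chosen)
    return noisy, corrupted


# Toy dataset: 20 instances, 3 numeric attributes.
data = [[float(i), float(i % 3), float(i * 2)] for i in range(20)]

# Inject noise into attributes 0 and 2 at a 10% noise level.
noisy, corrupted = inject_attribute_noise(data, attribute_indices=[0, 2], noise_level=0.1)
```

Because the corrupted attributes and rows are recorded, a ranking produced by any noise detection technique could then be compared against the attributes known to contain injected noise, repeating the run at several values of `noise_level`.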

Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
Publication Date: Jul 1, 2009
Citations: 56

Similar Papers

Identifying noisy features with the Pairwise Attribute Noise Detection Algorithm
Taghi M Khoshgoftaar ... Jason Van Hulse
Intelligent Data Analysis | VOL. 9
09 Dec 2005

The pairwise attribute noise detection algorithm
Jason D Van Hulse ... Haiying Huang
Knowledge and Information Systems | VOL. 11
08 Apr 2006

Issues in data mining: A comprehensive survey
Archana Purwar ... Sandeep Kumar Singh
01 Dec 2014

NG²CE: Double neural gas based cluster ensemble framework
Hantao Chen ... Zhiwen Yu
01 Jul 2012
