Abstract

In the real world, large amounts of data have accumulated and await analysis, but the imperfection of these data complicates the analysis process. By the principle of “garbage in, garbage out”, a model built on such data will mislead subsequent study. Multiple empirical studies have shown that noise in a dataset dramatically decreases classification accuracy and increases the complexity of classification. The problem of noise in classification has therefore long been a focus of machine learning and data mining. At the same time, because noise is uncertain, the problem remains difficult and open. To study the problem systematically, we summarize and analyze the main research from three aspects: noise models, methods of handling noise, and algorithms for handling noise. Based on past and current work, we discuss some new directions for solving the problem.
