Abstract

Effective data analysis and data mining depend on data availability and data quality. Data cleaning is a commonly used technique for improving data quality, and instance-level data cleaning is an important part of it. This paper compares and analyzes methods for detecting and cleaning attributes and record values in instance-level data cleaning, and experimentally analyzes methods for cleaning duplicate records. It introduces application fields of data cleaning technology, taking electrical engineering as a representative example, and offers suggestions for selecting an instance-level data cleaning technique according to the characteristics of the data set. From a summary and analysis of existing detection and cleaning methods, it concludes that instance-level data cleaning still has considerable room for research and development in long text, unstructured data, and specific domains. Finally, the challenges and future directions of instance-level data cleaning are discussed.
