An Analysis on Attribute Selection and Token Formation used for Duplicate Record Detection

Krishna Kant Tiwari, Dr Qaim Mehdi Rizbi

doi:10.29070/pv7aec32

Abstract

Data mining relies heavily on data pre-processing, and data cleaning methods that work for some types of data may not work for others. This paper analyzes and assesses a newly constructed method for attribute selection through extensive experiments. In the data cleaning process, reducing the number of attributes helps in dealing with noisy and duplicate data. The experimental findings show that the method is highly efficient and straightforward, and that it significantly reduces the number of attributes. The primary goal of attribute selection for data cleaning is to reduce the time required by the subsequent cleaning steps: token formation, record similarity computation, and deletion. The token formation algorithm builds smart tokens for data cleaning and is suited to data containing numeric, alphabetic, and non-numeric elements. Token-based data cleaning can then remove duplicate data efficiently. Together, attribute selection and the token-based technique shorten the time required for data cleaning.
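
The pipeline outlined in the abstract (select attributes, form tokens, compare records) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function names, the selection thresholds, and the token rules (digits kept in order, sorted word initials for alphabetic parts) are all hypothetical choices made for the example.

```python
import re
from collections import defaultdict


def select_attributes(records, min_distinct=0.5, max_missing=0.2):
    """Keep attributes that are mostly filled in and discriminate between records.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    total = len(records)
    selected = []
    for name in records[0]:
        present = [r.get(name, "").strip() for r in records]
        present = [v for v in present if v]
        missing_ratio = 1 - len(present) / total
        distinct_ratio = len(set(present)) / total
        if missing_ratio <= max_missing and distinct_ratio >= min_distinct:
            selected.append(name)
    return selected


def make_token(value):
    """Form a compact token from one field value (assumed rules).

    Digits are kept in their original order; each alphabetic word
    contributes its first letter, and the letters are sorted so that
    word order and punctuation do not matter.
    """
    text = value.upper()
    digits = "".join(ch for ch in text if ch.isdigit())
    words = re.findall(r"[A-Z]+", text)
    initials = "".join(sorted(word[0] for word in words))
    return digits + initials


def find_duplicates(records, attributes):
    """Group record indices whose tokens match on every selected attribute."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(make_token(record.get(a, "")) for a in attributes)
        groups[key].append(index)
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample data for the sketch.
records = [
    {"name": "Asha Rao", "phone": "555-0100", "country": "IN"},
    {"name": "Rao, Asha", "phone": "(555) 0100", "country": "IN"},
    {"name": "Vikram Singh", "phone": "555-0199", "country": "IN"},
]

attributes = select_attributes(records)
duplicates = find_duplicates(records, attributes)
print(attributes)
print(duplicates)
```

In this sample the `country` attribute is dropped because it takes a single value and cannot help tell records apart, so only two fields need to be tokenized and compared. Exact token matching is the simplest possible comparison; a record similarity measure, as mentioned in the abstract, would replace the exact-match grouping in a fuller treatment.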

More From: Journal of Advances in Science and Technology
