Abstract

Record linkage is a typical two-class recognition problem in data mining. To improve classification performance on this problem, this paper proposes applying three-way classification to identify uncertain points (regions) that are set aside for clerical investigation during decision-making. The three-way decision process is realized by a two-phase approach. In the first phase, an information granule is constructed to describe the uncertain region in the data space. In the second phase, the constructed granule is used to discriminate between certain points (those with a high likelihood of belonging to one of the classes) and uncertain points (those requiring clerical attention). Uncertain points are investigated manually; certain points are classified by a generic binary classifier. Synthetic and publicly available data are used to demonstrate the performance of the proposed approach. Finally, the proposed approach is shown to be effective in applications involving real-world record linkage data.
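
The two-phase process described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say how the information granule is built, so the sketch assumes a one-dimensional similarity score per record pair and a hypothetical interval granule spanning the scores that a simple threshold classifier gets wrong on labelled data. The names `IntervalGranule`, `build_granule`, `binary_classify`, and `three_way_decide`, and the 0.5 threshold, are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntervalGranule:
    """Hypothetical granule: an interval of similarity scores treated as uncertain."""
    lower: float
    upper: float

    def contains(self, score: float) -> bool:
        return self.lower <= score <= self.upper


def binary_classify(score: float, threshold: float = 0.5) -> str:
    """Stand-in for a generic binary classifier (assumed: a plain score threshold)."""
    return "match" if score >= threshold else "non-match"


def build_granule(
    scores: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,
) -> Optional[IntervalGranule]:
    """Phase 1 (assumed construction): cover the scores the classifier mislabels.

    Returns None when the classifier makes no errors on the labelled data,
    meaning no uncertain region is defined.
    """
    errors = [
        s for s, y in zip(scores, labels)
        if binary_classify(s, threshold) != y
    ]
    if not errors:
        return None
    return IntervalGranule(min(errors), max(errors))


def three_way_decide(
    score: float,
    granule: Optional[IntervalGranule],
    threshold: float = 0.5,
) -> str:
    """Phase 2: route uncertain points to clerical review, classify the rest."""
    if granule is not None and granule.contains(score):
        return "clerical-review"
    return binary_classify(score, threshold)
```

Given labelled pairs whose scores overlap near the threshold, `build_granule` returns the interval covering that overlap, and `three_way_decide` then yields one of three outcomes: `"match"`, `"non-match"`, or `"clerical-review"`. A wider granule sends more pairs to manual review in exchange for fewer automatic errors, which is the trade-off that three-way classification exposes.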
