The computation of optimal subset repairs

Dongjing Miao,Zhipeng Cai,Xiangyu Gao,Jianzhong Li,Xianmin Liu

doi:10.14778/3407790.3407809

Abstract

Computing an optimal subset repair of an inconsistent database is becoming a standalone research problem and has a wide range of applications. However, it has not been well-studied yet. A tight inapproximability bound of the problem computing optimal subset repairs is still unknown, and there is still no existing algorithm with a constant approximation factor better than two. In this paper, we prove a new tighter inapproximability bound of the problem computing optimal subset repairs. We show that it is usually NP-hard to approximate it within a factor better than 17/16. An algorithm with an approximation ratio (2 - 1/2 σ-1 )is developed, where σ is the number of functional dependencies. It is the current best algorithm in terms of approximation ratio. The ratio can be further improved if there are a large amount of quasi -Turán clusters in the input database. Plenty of experiments are conducted on real data to examine the performance and the effectiveness of the proposed approximation algorithms in real-world applications.

Full Text