Abstract

Truth discovery and rule-based data repairing are two classic lines of work for improving data quality in the database field. Truth discovery methods resolve multi-source conflicts for the same entity by estimating the reliability of each source, while rule-based data repairing methods resolve inconsistencies among different entities using integrity constraints. However, both lines of methods suffer from unsatisfactory performance because they lack sufficient evidence. In this paper, we propose AutoRepair, a novel automatic multi-source data repairing approach that enriches the evidence by combining the advantages of truth discovery and data repairing. We use functional dependencies, one of the most common types of constraints, to detect violations, and use source reliability as evidence to discover and repair the errors among these violations. At the same time, the repaired results are used to estimate source reliability. Because source reliability is unknown in advance, we model the process as an iterative framework that alternates between repairing and reliability estimation. Extensive experiments are conducted on both simulated and real-world datasets. The results demonstrate the advantages of our approach, which outperforms both recent truth discovery and rule-based data repairing methods.
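
The interplay described above, where a functional dependency groups conflicting values, source reliability decides the repair, and the repair feeds back into the reliability estimate, can be illustrated with a minimal sketch. This is not the AutoRepair algorithm itself, whose details the abstract does not give. It assumes a single functional dependency (`zip -> city`), a reliability-weighted vote as the repair rule, and the agreement rate with the repaired values as the reliability estimate. The function name `repair_with_fd`, the initial reliability of 0.8, and the sample claims are all hypothetical.

```python
from collections import defaultdict


def repair_with_fd(claims, lhs, rhs, max_iterations=10):
    """Alternate between repairing FD violations and re-estimating sources.

    claims: list of dicts with a "source" key plus the lhs and rhs attributes.
    The functional dependency lhs -> rhs requires one rhs value per lhs value.
    """
    sources = {c["source"] for c in claims}
    # Assumption: every source starts with the same prior reliability.
    reliability = {s: 0.8 for s in sources}

    # Claims sharing an lhs value must agree on rhs, so group them once.
    groups = defaultdict(list)
    for c in claims:
        groups[c[lhs]].append(c)

    repaired = {}
    for _ in range(max_iterations):
        # Step 1: repair each group with a reliability-weighted vote.
        for key, group in groups.items():
            votes = defaultdict(float)
            for c in group:
                votes[c[rhs]] += reliability[c["source"]]
            repaired[key] = max(votes, key=votes.get)

        # Step 2: re-estimate each source as its agreement rate with the repairs.
        updated = {}
        for s in sources:
            own = [c for c in claims if c["source"] == s]
            agree = sum(1 for c in own if c[rhs] == repaired[c[lhs]])
            updated[s] = agree / len(own)

        if updated == reliability:  # stop once the estimates are stable
            break
        reliability = updated

    return repaired, reliability


# Hypothetical claims from three sources; source C violates zip -> city.
CLAIMS = [
    {"source": "A", "zip": "10001", "city": "New York"},
    {"source": "A", "zip": "60601", "city": "Chicago"},
    {"source": "B", "zip": "10001", "city": "New York"},
    {"source": "B", "zip": "60601", "city": "Chicago"},
    {"source": "C", "zip": "10001", "city": "Newark"},
    {"source": "C", "zip": "60601", "city": "Chicago"},
]

repaired, reliability = repair_with_fd(CLAIMS, "zip", "city")
print(repaired)
print(reliability)
```

In this toy input, the two sources that agree outvote the third for zip `10001`, and the third source's reliability drops as a result. A fuller treatment would need to handle multiple constraints and ties, which this sketch ignores.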
