Abstract

Data quality is a major concern in today's information world, and poor data quality harms information-based businesses. Conditional functional dependencies (CFDs) extend functional dependencies (FDs) with conditions on data values, so that a dependency is required to hold only on the tuples matching a given pattern, and they are used as quality assessment rules. CFDs improve data quality by exposing inconsistencies: CFDs extracted from data can serve as data cleaning rules, and any tuple that violates a CFD can be flagged as inconsistent and repaired by applying the rule. Mining traditional FDs is already computationally intensive, and mining CFDs is even more costly. In this paper, we discuss an information-theoretic approach (ITCFD) to discovering CFDs that scales to large datasets.
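
To make the notion of a CFD violation concrete, the following is a minimal sketch of violation detection for a single CFD. It is not the ITCFD mining method discussed in the paper. The function names (`matches`, `cfd_violations`), the `"_"` wildcard convention, and the sample address rows are all assumptions introduced for illustration.

```python
from collections import defaultdict

WILDCARD = "_"  # assumed convention: "_" in a pattern matches any value


def matches(row, pattern, attrs):
    """True if the row agrees with the pattern on every attribute in attrs."""
    return all(pattern[a] == WILDCARD or row[a] == pattern[a] for a in attrs)


def cfd_violations(rows, lhs, rhs, pattern):
    """Return sorted indices of rows that violate the CFD (lhs -> rhs, pattern).

    A row is flagged if it matches the pattern on lhs and either
    (a) disagrees with a constant in the pattern on rhs, or
    (b) shares its lhs values with another matching row but differs on rhs.
    """
    violations = set()
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        if not matches(row, pattern, lhs):
            continue  # the condition does not apply to this row
        if pattern[rhs] != WILDCARD and row[rhs] != pattern[rhs]:
            violations.add(i)  # single-tuple violation of a constant
        groups[tuple(row[a] for a in lhs)].append(i)
    for idxs in groups.values():
        if len({rows[i][rhs] for i in idxs}) > 1:
            violations.update(idxs)  # same lhs values, conflicting rhs values
    return sorted(violations)


# Hypothetical data: within the UK, zip is assumed to determine street.
rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Crichton"},
    {"country": "US", "zip": "07974", "street": "Mtn Ave"},
    {"country": "US", "zip": "07974", "street": "Main St"},
]
uk_rule = {"country": "UK", "zip": WILDCARD, "street": WILDCARD}
print(cfd_violations(rows, ["country", "zip"], "street", uk_rule))  # [0, 1]
```

The two US rows also disagree on street for the same zip, but they are not flagged because the condition `country = "UK"` does not apply to them. With an all-wildcard pattern, the same check reduces to an ordinary FD test.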
