Abstract

Functional dependencies (FDs) are widely applied in data management tasks. Since the FDs that hold on data are usually unknown, FD discovery techniques are studied for automatically finding hidden FDs from data. In this paper, we develop techniques to dynamically discover FDs in response to changes to the data. Formally, given the complete set <tex>$\Sigma$</tex> of minimal and valid FDs on a relational instance <tex>$r$</tex>, we aim to find the complete set <tex>$\Sigma^{\prime}$</tex> of minimal and valid FDs on <tex>$r\oplus\Delta r$</tex>, where <tex>$\Delta r$</tex> is a set of tuple insertions and deletions. Unlike batch approaches that compute <tex>$\Sigma^{\prime}$</tex> on <tex>$r\oplus\Delta r$</tex> from scratch, our dynamic method computes <tex>$\Sigma^{\prime}$</tex> in response to <tex>$\Delta r$</tex> by leveraging the known <tex>$\Sigma$</tex> on <tex>$r$</tex>, and avoids processing the whole of <tex>$r$</tex> for each update from <tex>$\Delta r$</tex>. We tackle dynamic FD discovery on <tex>$r\oplus\Delta r$</tex> by dynamic hitting set enumeration on the difference-set of <tex>$r\oplus\Delta r$</tex>. Specifically, (1) leveraging auxiliary structures built on <tex>$r$</tex>, we first present an efficient algorithm to update the difference-set of <tex>$r$</tex> to that of <tex>$r\oplus\Delta r$</tex>. (2) We then compute <tex>$\Sigma^{\prime}$</tex> by recasting dynamic FD discovery as dynamic hitting set enumeration on the difference-set of <tex>$r\oplus\Delta r$</tex> and developing novel techniques for dynamic hitting set enumeration. (3) We finally experimentally verify the effectiveness and efficiency of our approaches, using real-life and synthetic data. The results show that our dynamic FD discovery method outperforms its batch counterparts on most tested data, even when <tex>$\Delta r$</tex> is up to 30% of <tex>$r$</tex>.
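
To make the link between FDs, difference sets, and hitting sets concrete, here is a minimal Python sketch of the standard batch reduction. It is not the dynamic algorithm of the paper, and it uses none of its auxiliary structures. It only illustrates that, for an attribute A, the minimal left-hand sides of valid FDs X -> A are the minimal hitting sets of the difference sets that contain A, with A removed. The function names, the row-as-dict representation, and the brute-force enumeration are all assumptions made for illustration.

```python
from itertools import combinations

def difference_sets(rows, attrs):
    """Collect, for every pair of tuples, the set of attributes on which they differ."""
    diffs = set()
    for t, u in combinations(rows, 2):
        d = frozenset(a for a in attrs if t[a] != u[a])
        if d:  # identical tuples contribute nothing
            diffs.add(d)
    return diffs

def minimal_hitting_sets(universe, family):
    """Brute-force enumeration by increasing size (illustrative; small inputs only)."""
    found = []
    for k in range(len(universe) + 1):
        for cand in combinations(sorted(universe), k):
            c = frozenset(cand)
            if any(h <= c for h in found):
                continue  # a smaller hitting set is contained in c, so c is not minimal
            if all(c & d for d in family):
                found.append(c)
    return found

def discover_fds(rows, attrs):
    """Return all minimal valid FDs as (lhs, rhs) pairs, recomputed from scratch."""
    diffs = difference_sets(rows, attrs)
    fds = set()
    for rhs in attrs:
        # Tuples that differ on rhs must also differ on some lhs attribute.
        family = [d - {rhs} for d in diffs if rhs in d]
        universe = [a for a in attrs if a != rhs]
        for lhs in minimal_hitting_sets(universe, family):
            fds.add((lhs, rhs))
    return fds

# Hypothetical toy instance r and a single-tuple insertion (Delta r).
attrs = ["A", "B", "C"]
r = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 10},
    {"A": 3, "B": "y", "C": 20},
]
sigma = discover_fds(r, attrs)
sigma_prime = discover_fds(r + [{"A": 4, "B": "x", "C": 20}], attrs)
```

On this toy instance, the inserted tuple adds new difference sets, which invalidates C -> B and B -> C while A -> B and A -> C survive. The sketch recomputes everything over the whole instance for each update, which is the cost a dynamic method aims to avoid.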

