Abstract
Detecting violations of conditional functional dependencies (CFDs) in a distributed environment has recently attracted considerable research interest. Very few solutions have been proposed for handling CFDs, and these are unsuitable for real-time big data applications. This article focuses on a big data solution to this problem. The proposed IMRCFDHBD algorithm reduces elapsed time and provides scalability with minimal data shipment. The results show that the algorithm outperforms state-of-the-art techniques in big data scenarios.
Highlights
Nowadays, due to the evolution of mobile devices, sensors, cloud computing, and digitalisation, we are able to collect huge amounts of data
Fan's incHor algorithm focuses only on minimizing data shipment, whereas our algorithm focuses on elapsed time and scalability, which are essential in big data applications
The results prove that IMRCFDHBD is an effective incremental algorithm, applicable to big data
Summary
Due to the evolution of mobile devices, sensors, cloud computing, and digitalisation, we are able to collect huge amounts of data. The algorithms in [25] leverage the techniques of [15] to reduce data shipment, in particular when validating multiple CFDs. These are insufficient in real-time big data scenarios. The proposed Incremental MapReduce based Conditional Functional Dependency violation detection algorithm for Horizontally partitioned Big Data (IMRCFDHBD) is an incremental algorithm that uses horizontal partitioning. We compared it with incHor and with batch counterparts. In these comparisons, the algorithm surpasses the existing techniques in performance, making it suitable for real-time big data applications.
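To make the idea of incremental violation detection over horizontal partitions concrete, the following is a minimal sketch and not the IMRCFDHBD algorithm itself. It assumes a single CFD with a wildcard on the right-hand side, written here as ([cc, zip] → street, (44, _ ‖ _)), so that among tuples with cc = 44 the zip code must determine the street. The names `IncrementalCFDChecker`, `map_partition`, and `matches`, the attribute names, and the sample tuples are all hypothetical. The map step runs per partition, and the reduce step is restricted to the keys touched by the newly inserted tuples.

```python
from collections import defaultdict

WILDCARD = "_"  # pattern entry that matches any value (assumed notation)


def matches(row, attrs, pattern):
    """True if the row agrees with every constant in the pattern."""
    return all(p == WILDCARD or row[a] == p for a, p in zip(attrs, pattern))


def map_partition(partition, lhs, rhs, lhs_pattern):
    """Map step: emit (LHS values, RHS value) for rows the CFD applies to."""
    for row in partition:
        if matches(row, lhs, lhs_pattern):
            yield tuple(row[a] for a in lhs), row[rhs]


class IncrementalCFDChecker:
    """Hypothetical checker for one CFD with a wildcard right-hand side."""

    def __init__(self, lhs, rhs, lhs_pattern):
        self.lhs, self.rhs, self.lhs_pattern = lhs, rhs, lhs_pattern
        # LHS key -> {RHS value: count}; state kept between batches.
        self.index = defaultdict(lambda: defaultdict(int))

    def insert(self, partitions):
        """Apply a batch of inserted rows (one list per partition).

        Returns the touched LHS keys that are in violation after the batch.
        """
        touched = set()
        for part in partitions:
            for key, value in map_partition(
                part, self.lhs, self.rhs, self.lhs_pattern
            ):
                self.index[key][value] += 1
                touched.add(key)
        # Reduce step, restricted to keys affected by this batch.
        return {k for k in touched if len(self.index[k]) > 1}

    def violations(self):
        """All LHS keys currently associated with more than one RHS value."""
        return {k for k, vals in self.index.items() if len(vals) > 1}


# Two horizontal partitions of the same relation (made-up sample data).
site_a = [
    {"cc": "44", "zip": "EH4", "street": "Mayfield"},
    {"cc": "01", "zip": "07974", "street": "Tree Ave"},
]
site_b = [
    {"cc": "44", "zip": "EH4", "street": "Crichton"},
    {"cc": "01", "zip": "07974", "street": "Main St"},
]

checker = IncrementalCFDChecker(["cc", "zip"], "street", ["44", WILDCARD])
initial = checker.insert([site_a, site_b])
delta1 = checker.insert([[{"cc": "44", "zip": "W1B", "street": "Regent"}]])
delta2 = checker.insert([[{"cc": "44", "zip": "W1B", "street": "Oxford"}]])
```

In this sketch the initial load flags the key ("44", "EH4"), while the rows with cc = 01 are ignored because they do not match the pattern. Each later batch inspects only the keys it touches rather than rescanning all partitions, which is the general motivation for incremental detection. Deletions, constant right-hand-side patterns, and multiple CFDs are left out.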