Abstract
The volume of multimedia data has grown rapidly because of improvements in data collection and storage technologies. Association rule mining (ARM) is a data mining technique widely used to extract useful information from data warehouses. In real-world big data applications, fast and effective data mining algorithms are increasingly valuable. In this paper, we propose DCE-Miner, a fast association rule mining algorithm with low memory requirements based on the MapReduce framework. In the precomputation phase, we split large datasets into equal-sized smaller ones using a data division method. In the frequent K-itemset mining phase, the mappers read the small datasets and distribute the data to reducers based on the closed-set characteristics associated with each partition. The reducers use bitmaps to accelerate computation and store the possible frequent 2-itemsets to reduce later computation. Extensive experimental results show that on large-scale datasets with up to 40 million transactions, DCE-Miner achieves better performance and is more robust with respect to dataset size and support level than existing algorithms.
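To make the bitmap idea concrete, the following is a minimal single-machine sketch of bitmap-based support counting for frequent 2-itemsets. It is not the DCE-Miner implementation: the function names `build_bitmaps` and `frequent_2_itemsets`, the toy transactions, and the use of Python integers as bitmaps are all assumptions made for illustration, and the MapReduce partitioning and closed-set distribution steps are omitted.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it.

    Hypothetical helper: Python ints stand in for fixed-width bitmaps.
    """
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def popcount(bitmap):
    """Count the set bits, i.e. the number of supporting transactions."""
    return bin(bitmap).count("1")


def frequent_2_itemsets(transactions, min_support):
    """Return {(a, b): support} for every item pair meeting min_support."""
    bitmaps = build_bitmaps(transactions)
    # Apriori pruning: only frequent single items can form frequent pairs.
    frequent_items = sorted(
        item for item, bm in bitmaps.items() if popcount(bm) >= min_support
    )
    result = {}
    for a, b in combinations(frequent_items, 2):
        # A bitwise AND yields the transactions that contain both items.
        support = popcount(bitmaps[a] & bitmaps[b])
        if support >= min_support:
            result[(a, b)] = support
    return result


# Toy data, invented for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_2_itemsets(transactions, min_support=2)
```

The point of the representation is that the support of a pair reduces to one AND plus a bit count, so the transactions need not be rescanned for each candidate. Caching the resulting pair supports is one plausible reading of how stored 2-itemsets could save work when larger itemsets are evaluated.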