Fast Frequent Item Mining from Big Data using Map Reduce and Bit Vectors

Thirumaran S, et al.

doi:10.17762/turcomat.v12i2.1525

Abstract

Big data has attracted sustained attention in recent years, and mining frequent patterns from it is an actively evolving area that has drawn considerable interest from the research community. Data is generally mined with Apriori-based, tree-based, and hash-based algorithms, but most of these existing algorithms suffer from a number of limitations. This paper proposes a new method that addresses the most common problems related to speed, memory consumption, and search space. The algorithm, named Dual Mine, employs binary vector and vertical data representations within Map Reduce to discover the most frequent patterns in large data sets. Dual Mine is then compared with several existing algorithms to assess its efficiency, and the experimental results indicate that it outperforms the other algorithms by a wide margin in both speed and memory usage.

Highlights

The main purpose of data mining is to unearth previously unknown patterns hidden in raw data [1].
The most common data mining task is frequent pattern mining, in which the most frequently occurring items are found.
Frequent itemset mining has gained considerable significance in the research community as businesses have become globalized. Businesses need to exploit their available data assets promptly and fully in order to promote their products worldwide.

Summary

Introduction

The main purpose of data mining is to unearth previously unknown patterns hidden in raw data [1]. The most common data mining task is frequent pattern mining, in which the most frequently occurring items are found (for example, market basket analysis, commodities frequently purchased by consumers, and frequently visited pages of a website). The pioneering work in frequent itemset mining was carried out by Agrawal and Srikant, who proposed the Apriori algorithm [2]. Data mining has become an essential service that turns the patterns concealed in raw data into human-readable, understandable information for wider use. It is widely applied in marketing, bioengineering, gene technology, finance, and engineering. Hand, Mannila, and Smyth [4] define data mining as “The analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner.”
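To make the idea of frequent itemset mining over a vertical, bit-vector layout concrete, the following is a minimal sketch in Python. It is not the Dual Mine algorithm and does not use Map Reduce; the function names (`build_bit_vectors`, `support`, `frequent_pairs`), the sample transactions, and the support threshold are all assumptions chosen for illustration. Each item is mapped to an integer whose bit i is set when transaction i contains the item, so the support of an itemset is the number of set bits in the bitwise AND of its items' vectors.

```python
from itertools import combinations


def build_bit_vectors(transactions):
    """Vertical layout: item -> int whose bit i is set if transaction i has the item."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(vectors, itemset):
    """Support count of an itemset: set bits in the AND of its item vectors."""
    items = iter(itemset)
    combined = vectors.get(next(items), 0)
    for item in items:
        combined &= vectors.get(item, 0)
    return bin(combined).count("1")


def frequent_pairs(vectors, min_support):
    """Pairs of frequent items whose joint support meets the threshold."""
    frequent_items = sorted(
        item for item in vectors if support(vectors, [item]) >= min_support
    )
    result = {}
    for pair in combinations(frequent_items, 2):
        count = support(vectors, pair)
        if count >= min_support:
            result[pair] = count
    return result


# Hypothetical market-basket data, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]

vectors = build_bit_vectors(transactions)
print(frequent_pairs(vectors, min_support=3))  # {('bread', 'milk'): 3}
```

Because each support count reduces to one AND and one bit count, the transactions only need to be scanned once to build the vectors, which is the general appeal of bit-vector representations for this task.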

Big Data

Background of the Paper

Scope of the Paper

Challenges in the Paper

Motivation

Map Reduce

Proposed Approach

10. Procedure to Generate Bit Vector

11. Experimental Evaluation

12. Conclusion


Journal: Turkish Journal of Computer and Mathematics Education	Publication Date: Apr 11, 2021
Citations: 1	License type: cc-by

Similar Papers

Frequent Itemset Mining Based on Development of FP-growth Algorithm and Use MapReduce Technique
Dima Mufti Alchawafa ... Zakria Mahrousa
Mağallaẗ ittiḥād al-ğāmiʿāt al-ʿarabiyyaẗ li-l-dirāsāt wa-al-buḥūṯ al-handasiyyaẗ | VOL. 28
31 Mar 2021

Parallel mining frequent patterns over big transactional data in extended mapreduce
Hui Chen ... Jie Zhong
01 Dec 2013

WITHDRAWN: Comparative Research on Active Learning of Big Data based on Mapreduce and Spark
Zhang Ruihong ... Hu Zhihua
Microprocessors
01 Nov 2020

Privacy-Preserving Frequent Pattern Mining from Big Uncertain Data
Carson K Leung ... Alfredo Cuzzocrea
01 Dec 2018
