Abstract

Big Data refers to large and complex datasets that cannot be managed using traditional data processing tools. Big data mining is the process of extracting valuable information from these large datasets. Association rule mining is a type of data mining intended to determine interesting associations between items and to establish a set of association rules whose support is greater than a specified threshold. Classical association rules can only be extracted from binary data, where an item either exists in a transaction or does not; they fail to deal effectively with quantitative attributes, and the quality of the generated association rules decreases because of the sharp boundary problem. To overcome the drawbacks of classical association rule mining, we propose a new neutrosophic association rule algorithm. The algorithm generates association rules by dealing with the membership, indeterminacy, and non-membership functions of items, leading to an efficient decision-making system that considers all vague association rules. To demonstrate the validity of the method, we compare fuzzy mining with neutrosophic mining. The results show that the proposed approach increases the number of generated association rules.
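As a minimal sketch of the classical setting the abstract describes (binary transactions and a support threshold), the snippet below counts how often item pairs co-occur and keeps the rules whose support reaches a chosen threshold. The toy transactions, the function names `support` and `frequent_pair_rules`, and the `min_support` parameter are illustrative assumptions, not taken from the paper, and the sketch does not implement the proposed neutrosophic algorithm.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items it contains,
# i.e. the binary "item present / item absent" view used by classical mining.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_pair_rules(transactions, min_support):
    """Return (antecedent, consequent, support) for item pairs whose
    support is at or above `min_support` (an assumed threshold)."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s >= min_support:
            # Each frequent pair yields two candidate rules: a -> b and b -> a.
            rules.append((a, b, s))
            rules.append((b, a, s))
    return rules


if __name__ == "__main__":
    for threshold in (0.25, 0.5):
        rules = frequent_pair_rules(transactions, threshold)
        print(threshold, len(rules))
```

Lowering `min_support` admits more rules, which is why a comparison between mining approaches is usually reported at several thresholds.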

Highlights

  • The term ‘Big Data’ originated from the massive amount of data produced every day

  • We compared fuzzy mining and neutrosophic mining algorithms and found that neutrosophic mining generates more association rules

  • The comparison is based on the number of association rules generated at different min-support thresholds


Summary

Introduction

The term ‘Big Data’ originated from the massive amount of data produced every day. The daily volume includes 1 billion queries, more than 800 million Facebook updates, and up to 4 billion YouTube views, and the amount of data produced grows by 40% every year. Other sources of data are mobile devices and big companies. The produced data may be structured, semi-structured, or unstructured. Most big data is unstructured; only 20% of it is structured. There are four dimensions of big data:
