Abstract

Apriori is a popular Association Rule Mining algorithm that finds the frequent itemsets in a database. The user supplies the constraints for finding these itemsets in terms of support, measured by the proportion of transactions in which an itemset appears, and confidence, measured by the proportion of transactions containing one itemset in which another itemset also appears. The problem with Apriori is that it is highly iterative, so its efficiency decreases rapidly as the size or dimensionality of the dataset increases. Our project improves its efficiency with the help of OpenMP threads. We use data decomposition to split the transaction database into parts; each thread takes one part and finds the support counts of all candidate itemsets over the transactions assigned to it. As an example application, the project is used to estimate the probability that a forest fire occurs. Here, the transaction database consists of recorded occurrences of natural phenomena, and a few transactions also contain the forest fire phenomenon, meaning that a fire occurred in the presence of the other items in that transaction. Given a new transaction from the user, the probability (or confidence) that a forest fire occurs given this transaction is then calculated.

Summary

Introduction

A common way of inferring something from data is to understand the relationships between the different instances in the data, or tuples in a table. This form of data mining is called Association Rule Mining (ARM), in which associations between the different itemsets in the database are found with the help of various algorithms. The user supplies the constraints for finding frequent itemsets in terms of support, measured by the proportion of transactions in which an itemset appears, and confidence, measured by the proportion of transactions containing one itemset in which another itemset also appears. The problem with the Apriori algorithm is that it is highly iterative, and its efficiency decreases rapidly as the size or dimensionality of the dataset increases. We use data decomposition to split the transaction database into parts; each thread takes one part and finds the support counts of all candidate itemsets over the transactions assigned to it.
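
The data decomposition described above might look roughly like the following OpenMP sketch. It is a minimal illustration, not the project's implementation: the integer item encoding, the `count_supports` function, and the choice of per-thread local counters merged in a critical section are all assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed encoding: an itemset is a sorted vector of integer item ids.
using Itemset = std::vector<int>;

// Returns, for each candidate, the number of transactions that contain it.
std::vector<std::size_t> count_supports(const std::vector<Itemset>& transactions,
                                        const std::vector<Itemset>& candidates) {
    std::vector<std::size_t> total(candidates.size(), 0);

    #pragma omp parallel
    {
        // Each thread accumulates counts privately for its share of the transactions.
        std::vector<std::size_t> local(candidates.size(), 0);

        // Data decomposition: the transactions are divided among the threads.
        #pragma omp for schedule(static) nowait
        for (long i = 0; i < static_cast<long>(transactions.size()); ++i) {
            const Itemset& t = transactions[i];
            for (std::size_t c = 0; c < candidates.size(); ++c) {
                // Both ranges are sorted, as std::includes requires.
                if (std::includes(t.begin(), t.end(),
                                  candidates[c].begin(), candidates[c].end())) {
                    ++local[c];
                }
            }
        }

        // Merge the per-thread partial counts into the shared totals, one thread at a time.
        #pragma omp critical
        {
            for (std::size_t c = 0; c < candidates.size(); ++c) {
                total[c] += local[c];
            }
        }
    }
    return total;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang), the transaction loop is shared among the threads. Without that flag the pragmas are ignored and the same code runs serially with identical results. Private counters avoid contention on the shared array during counting, at the cost of one short merge step per thread.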

