Abstract

Many applications, such as intelligent tutoring systems (ITS), use data that are better represented in binary form. This paper presents a novel algorithm called MBER (Mining Binary Data Efficiently by Reduced AND operations) for finding frequent itemsets in a binary dataset using matrix algebra operations. Frequent itemsets are sets of items in a transactional database that occur together frequently, that is, at least as often as a user-given threshold called the minimum support. Existing algorithms that operate on binary data, such as ABBM, generate frequent itemsets by performing AND operations exhaustively, in a brute-force manner. MBER, on the other hand, first uses matrix algebra operations to find those transactions that have m items in common (called potential transactions) and then performs AND operations only on these potential transactions. This reduces the total number of AND operations required considerably (by less than a quarter) and thereby improves the efficiency of the algorithm. MBER also shows a significant improvement over traditional frequent-itemset algorithms, such as Apriori, by eliminating the need to (i) scan the database more than once and (ii) generate a large number of candidate itemsets. The paper concludes with a proof of correctness of MBER and a discussion of its evaluation.
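
The core idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, only one plausible reading of the abstract under stated assumptions: the dataset is a 0/1 matrix B with one row per transaction, the product B·Bᵀ gives the number of items shared by each pair of transactions, "potential transactions" are taken to be pairs sharing at least m items, and min_support is at least 2. The names `common_item_counts` and `frequent_itemsets` are hypothetical, and support is counted naively here for clarity rather than in a single scan.

```python
from itertools import combinations


def common_item_counts(rows):
    # Entry [i][j] of B * B^T is the number of items that
    # transactions i and j have in common.
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def frequent_itemsets(rows, m, min_support):
    # Assumption: min_support >= 2, so every frequent m-itemset lies in
    # the AND of at least one pair of transactions sharing >= m items.
    counts = common_item_counts(rows)
    n = len(rows)
    candidates = set()
    and_ops = 0
    for i in range(n):
        for j in range(i + 1, n):
            if counts[i][j] < m:
                continue  # not a potential pair, so the AND is skipped
            and_ops += 1
            # Bitwise AND of the two rows: indices of the shared items.
            shared = [k for k, (a, b) in enumerate(zip(rows[i], rows[j])) if a & b]
            candidates.update(combinations(shared, m))
    result = {}
    for items in candidates:
        # Naive support count (illustration only).
        sup = sum(1 for r in rows if all(r[k] for k in items))
        if sup >= min_support:
            result[items] = sup
    return result, and_ops


# Toy dataset: 4 transactions over 4 items (columns 0..3).
B = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
itemsets, and_ops = frequent_itemsets(B, m=2, min_support=2)
print(itemsets, and_ops)
```

On this toy dataset only 4 of the 6 transaction pairs share two or more items, so only 4 AND operations are performed. A brute-force approach would AND all 6 pairs.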
