Abstract

This paper explores the concept and limitations of frequent itemset mining (FIM), which extracts previously unknown, hidden patterns as itemsets from a transactional database. Candidate generation and support calculation are the major tasks in FIM, and they give rise to three major limitations: (i) a huge number of possible frequent itemsets is generated as candidates at each pass; (ii) the database is scanned at each pass to calculate the support of the generated itemsets; and (iii) the generated itemsets are highly sensitive to the minimum support threshold. SS-FIM, a single-scan algorithm, was designed to address these limitations; however, it hashes many unnecessary itemsets into its buckets. To overcome this drawback, this paper proposes a partition-based approach, PSSFIM, which identifies frequent itemsets in a single scan of the database. A distinctive feature of PSSFIM is that the number of candidate itemsets it generates is independent of the minimum support. It admits into the hash only those candidates that can possibly be frequent, which reduces the cost of verifying the support of the generated candidates. PSSFIM is compared with SS-FIM and Apriori on standard datasets, and the results show that it compares favorably with both.
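
The generic single-scan idea referred to above can be illustrated with a minimal Python sketch. It is not the PSSFIM algorithm and does not reproduce its partitioning; it only shows how one pass over the transactions can hash every itemset with its count, so that the set of hashed candidates does not depend on the minimum support. The function name `single_scan_frequent_itemsets`, the use of an absolute support count, and the toy transactions are assumptions made for illustration. The sketch enumerates every subset of each transaction, so its cost grows exponentially with transaction length, which reflects the kind of unnecessary hashing the abstract attributes to SS-FIM.

```python
from collections import Counter
from itertools import combinations


def single_scan_frequent_itemsets(transactions, min_support):
    """Count every itemset in one pass, then keep those meeting min_support.

    Hypothetical helper for illustration only; min_support is assumed to be
    an absolute transaction count rather than a fraction.
    """
    counts = Counter()  # plays the role of the hash table of candidates
    for transaction in transactions:  # the only scan of the database
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # The threshold is applied only after the scan, so the hashed
    # candidates themselves are independent of min_support.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy transactional database (assumed data, not from the paper).
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
frequent = single_scan_frequent_itemsets(transactions, min_support=2)
```

With these four transactions and a minimum support of 2, the itemsets `("b", "c")` and `("a", "b", "c")` are hashed but discarded, since each occurs only once. A partition-based method would aim to avoid storing such candidates in the first place.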
