Abstract
The problem of mining frequent closed patterns has received considerable attention recently as it promises to have much less redundancy compared to discovering all frequent patterns. Existing algorithms can presently be separated into two groups, feature (column) enumeration and row enumeration. Feature enumeration algorithms like CHARM and CLOSET+ are efficient for datasets with small number of features and large number of rows since the number of feature combinations to be enumerated is small. Row enumeration algorithms like CARPENTER on the other hand are more suitable for datasets (eg. bioinformatics data) with large number of features and small number of rows. Both groups of algorithms, however, will encounter problem for datasets that have large number of rows and features. In this paper, we describe a new algorithm called COBBLER which can efficiently mine such datasets . COBBLER is designed to dynamically switch between feature enumeration and row enumeration depending on the data characteristic in the process of mining. As such, each portion of the dataset can be processed using the most suitable method, making the mining more efficient. Several experiments on real-life and synthetic datasets show that COBBLER is an order of magnitude better than previous closed pattern mining algorithms like CHARM, CLOSET+ and CARPENTER.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.