Many algorithms for mining association rules have been proposed, but most are suited only to discovering particular kinds of frequent itemsets from particular kinds of data sets in a specified environment; that is, they are not general. This paper proposes a general framework for mining association rules based on composite granules, a data mining model that places no prior restrictions on the frequent itemsets, the data sets, or the mining environment. To discover frequent itemsets, the framework applies an iterative method that repartitions the frequent attributes and iteratively reconstructs the mixed radix information system, thereby reducing the relational database. To make the framework general, the paper introduces two new concepts and data models: a mixed radix information system, which describes a relational database, and composite granules, which relate an information system to a mixed radix information system. A composite granule holds the same extension in both information systems and exists in both simultaneously. The mixed radix information system helps the framework reduce the data and improves its performance in generating frequent itemsets. The composite granule model relates an information granule to a digital information granule, and lets the framework compute support without repeatedly reading the database or using complex data structures.
Finally, a new taxonomy is presented to verify the generality and efficiency of the mining framework. All experiments based on this taxonomy indicate that the framework has the required generality and that it outperforms classical mining frameworks.
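
The two ideas named above can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy relation `ROWS` and the helpers `build_domains`, `encode`, `decode`, `build_granules`, and `support` are hypothetical names introduced here. The sketch assumes that a "mixed radix" description means encoding each record as one integer whose digit radices are the attribute domain sizes, and that a granule's extension is the set of row ids containing a given attribute value, so that support can be computed by intersecting extensions instead of rescanning the rows.

```python
# Hypothetical toy relation: each row is a tuple of attribute values.
ROWS = [
    ("red", "small", "yes"),
    ("red", "large", "yes"),
    ("blue", "small", "no"),
    ("red", "small", "no"),
]

def build_domains(rows):
    """Return, per attribute, a sorted list of its distinct values."""
    return [sorted(set(column)) for column in zip(*rows)]

def encode(row, domains):
    """Encode a row as one integer; digit i has radix len(domains[i])."""
    code = 0
    for value, domain in zip(row, domains):
        code = code * len(domain) + domain.index(value)
    return code

def decode(code, domains):
    """Invert encode(): peel digits off from the last attribute to the first."""
    values = []
    for domain in reversed(domains):
        code, digit = divmod(code, len(domain))
        values.append(domain[digit])
    return tuple(reversed(values))

def build_granules(rows):
    """Map each (attribute index, value) item to the set of row ids holding it."""
    granules = {}
    for row_id, row in enumerate(rows):
        for attr, value in enumerate(row):
            granules.setdefault((attr, value), set()).add(row_id)
    return granules

def support(itemset, granules, n_rows):
    """Fraction of rows containing every item, via intersection of extensions."""
    extensions = [granules.get(item, set()) for item in itemset]
    if not extensions:
        return 1.0  # the empty itemset is contained in every row
    return len(set.intersection(*extensions)) / n_rows

if __name__ == "__main__":
    domains = build_domains(ROWS)
    granules = build_granules(ROWS)  # the only pass over the rows
    print([encode(row, domains) for row in ROWS])
    print(support([(0, "red"), (1, "small")], granules, len(ROWS)))
```

In this sketch the rows are read once to build the extensions; every later support query touches only the stored sets, which is the property the abstract attributes to composite granules.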