Abstract
Several methods of extracting association rules have been reported. A new evolutionary computation method named Genetic Network Programming (GNP) has also been developed recently, and its effectiveness has been shown for small datasets. However, it has not been tested on large datasets, particularly datasets with a large number of attributes. The aim of this paper is to extract association rules from large and dense datasets using GNP, considering a real-world database with a huge number of attributes. We propose a new method in which a large database is divided into many small datasets, and each GNP handles one dataset with an appropriately sized set of attributes, which are selected randomly from the large database and then generated genetically. These GNPs are processed in parallel. We also propose some new genetic operations to improve both the number and the quality of the extracted rules. In simulations, the proposed method shows remarkable improvement.

Fig. 1 shows the architecture of the proposed method. We use a client/server model. The client side carries out preprocessing of the large database, assignment of files to each server, rule checking, and genetic operations on files. The server side processes each file independently using the conventional GNP-based mining method. The features and advantages of the proposed method are the following. Rule extraction is done in parallel. Each file generates its own local pool of rules. Files (datasets) are treated as individuals, so that new genetic operations can be applied to them to improve rule extraction. Extracted rules are stored in a global pool. The rules are checked to avoid redundancy, ensuring that only new rules are stored.
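The divide, mine in parallel, and merge flow described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the thresholds, and the toy transactions are hypothetical, a thread pool stands in for the servers, and a brute-force miner of single-antecedent rules stands in for the GNP-based mining method, which is not reproduced. The genetic operations on files are also omitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_attributes(attributes, subset_size, n_files, rng):
    """Client side: draw n_files random attribute subsets ("files")."""
    return [tuple(sorted(rng.sample(attributes, subset_size)))
            for _ in range(n_files)]


def mine_file(transactions, file_attrs, min_support, min_confidence):
    """Server side stand-in (NOT GNP): brute-force rules x -> y over one file.

    Returns the local pool of rules as a set of (antecedent, consequent) pairs.
    """
    n = len(transactions)
    local_pool = set()
    for a, b in combinations(file_attrs, 2):
        for x, y in ((a, b), (b, a)):
            count_x = sum(1 for t in transactions if x in t)
            count_xy = sum(1 for t in transactions if x in t and y in t)
            if count_x == 0:
                continue
            support = count_xy / n
            confidence = count_xy / count_x
            if support >= min_support and confidence >= min_confidence:
                local_pool.add((x, y))
    return local_pool


def extract_rules(transactions, attributes, subset_size=3, n_files=4,
                  min_support=0.4, min_confidence=0.8, seed=0):
    """Split attributes into files, mine each in parallel, merge the pools."""
    rng = random.Random(seed)
    files = split_attributes(attributes, subset_size, n_files, rng)
    global_pool = set()
    # Threads stand in for the servers; each one mines a single file.
    with ThreadPoolExecutor() as executor:
        local_pools = executor.map(
            lambda f: mine_file(transactions, f, min_support, min_confidence),
            files,
        )
        for local_pool in local_pools:
            # Rule checking: set union stores only rules not already present.
            global_pool |= local_pool
    return global_pool


# Hypothetical toy data: each transaction is the set of attributes it contains.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "d"},
]
attributes = ["a", "b", "c", "d"]
rules = extract_rules(transactions, attributes)
```

Because the global pool is a set, a rule found by several files is stored once, which mirrors the redundancy check in a simplified form. A real deployment would replace `mine_file` with the GNP-based miner and the thread pool with separate server processes or machines.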