Abstract

Associative Classification (AC), a combination of two important and distinct fields (classification and association rule mining), aims to build accurate and interpretable classifiers by means of association rules. Generating association rules is exponential by nature; AC research has therefore focused on reducing redundant rules through rule pruning and rule ranking techniques. These techniques play an important part in improving efficiency; however, pruning may reduce accuracy by discarding interesting rules. Further, these techniques are time-consuming to process and require domain-specific knowledge to select the best ranking and pruning strategy. To overcome these limitations, this research proposes an automata-based solution that improves the classifier’s accuracy while replacing ranking and pruning. A new merging concept is introduced that uses structure-based similarity to merge the association rules. Merging not only helps reduce the classifier size but also minimizes the loss of information by avoiding pruning. Extensive experiments showed that the proposed algorithm is more efficient than AC, Naive Bayesian, and rule- and tree-based classifiers in terms of accuracy, space, and speed. Merging takes advantage of repetition in the rule set and keeps the classifier as small as possible.

Highlights

  • Classification is considered one of the main pillars of DM and ML [1, 2]

  • A new storage structure is needed that can help reduce the size of the dataset, making processing less time-consuming while improving accuracy. Keeping these shortcomings in mind, we propose an automata-based replacement for ranking and pruning

  • Automata were utilized for two purposes: a) as a storage structure in classification; and b) to replace the rule pruning and rule ranking phases of associative classification


Summary

INTRODUCTION

Classification is considered one of the main pillars of data mining (DM) and machine learning (ML) [1, 2]. It is a data analysis technique used to categorize data into different classes based on common characteristics or associations in the data. AC is based on association rule mining (ARM): first, the strongest Class Association Rules (CARs) are discovered from the dataset, and those rules are then converted into a classifier model. These strong associations, expressed as CARs, make the classifier more logical and improve accuracy. A new storage structure is needed that can help reduce the size of the dataset, making processing less time-consuming while improving accuracy. Keeping these shortcomings in mind, we propose an automata-based replacement for ranking and pruning.
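
The idea of storing class association rules in an automaton, so that rules with a common structure share states instead of being pruned, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a simple prefix-sharing automaton over sorted (attribute, value) items, and the names `State`, `build_automaton`, `count_states`, and `classify`, as well as the toy rules, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    """One automaton state; transitions are keyed by (attribute, value) items."""
    transitions: dict = field(default_factory=dict)
    label: Optional[str] = None  # class label if some rule ends in this state


def build_automaton(rules):
    """Merge rules that share a leading run of items into shared states.

    `rules` is an iterable of (antecedent, class_label) pairs, where the
    antecedent is a collection of (attribute, value) items. Returns the start
    state and the rules that conflict with one already stored
    (same antecedent, different class).
    """
    start = State()
    conflicts = []
    for antecedent, label in rules:
        state = start
        # Assumption: items are put in a canonical (sorted) order so that
        # rules with the same leading items follow the same path.
        for item in sorted(antecedent):
            state = state.transitions.setdefault(item, State())
        if state.label is None:
            state.label = label
        elif state.label != label:
            conflicts.append((antecedent, label))
    return start, conflicts


def count_states(state):
    """Number of states reachable from `state`, including itself."""
    return 1 + sum(count_states(s) for s in state.transitions.values())


def classify(start, instance):
    """Label of the longest stored rule whose items all occur in `instance`."""
    items = set(instance)
    best_depth, best_label = 0, None
    stack = [(start, 0)]
    while stack:
        state, depth = stack.pop()
        if state.label is not None and depth > best_depth:
            best_depth, best_label = depth, state.label
        for item, nxt in state.transitions.items():
            if item in items:
                stack.append((nxt, depth + 1))
    return best_label


# Toy rules (made up for illustration only).
rules = [
    ({("outlook", "sunny"), ("windy", "false")}, "yes"),
    ({("outlook", "sunny"), ("windy", "true")}, "no"),
    ({("outlook", "overcast")}, "yes"),
    ({("outlook", "sunny"), ("windy", "true")}, "yes"),  # conflicts with rule 2
]
start, conflicts = build_automaton(rules)
prediction = classify(
    start, {("outlook", "sunny"), ("windy", "true"), ("temp", "hot")}
)
```

In this sketch the two "sunny" rules share the state reached by `("outlook", "sunny")`, so repetition in the rule set shrinks the stored structure without discarding any rule. Conflicting rules are only collected here; how they are resolved is left open.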

ASSOCIATIVE CLASSIFICATION USING AUTOMATA
Class Association Rules Generation and Rules Pruning
Building Automata using CAR
Properties of Automata in ACA
CONFLICT RESOLUTION
CLASSIFICATION OF TEST INSTANCES
WEIGHTED RATIO MEASUREMENT
SELECTION OF DATASETS FOR EXPERIMENTS
RESULTS AND DISCUSSION
Small Discrete Dataset
Large Discrete Dataset
Small Continuous Datasets
Large Continuous Datasets
SELECTION OF ALGORITHMS FOR COMPARISON
COMPARISON OF ACA WITH OTHER CLASSIFIERS BASED ON ACCURACY
Accuracy Comparison of ACA with Associative Classifiers
Accuracy Comparison of ACA with Rules and Tree based Classifiers
Accuracy Comparison of ACA with Naive Bayesian
Time Complexity Analysis
Space Complexity Analysis
CONCLUSION
FUTURE WORK