Abstract

Machine learning techniques are increasingly prevalent as datasets continue to grow. Associative classification (AC), which combines classification with association rule mining, plays an important role in understanding large datasets, but mining such datasets typically generates a large number of rules. Clustering, on the other hand, can reduce the rule space and yield compact models. These observations were the main motivation for this work. We propose a new distance (similarity) metric based on "direct" and "indirect" measures and explain the overall importance of this method, which can produce compact and accurate models. Specifically, we employ agglomerative hierarchical clustering to build associative classification models that contain fewer rules. Furthermore, a new strategy, based on the cluster center, is presented for extracting a representative rule from each cluster. Twelve real-world datasets were evaluated experimentally for accuracy and compactness, and the results were compared with those of previously established associative classifiers. The results show that our method outperformed the other algorithms in terms of classifier size on most of the datasets while remaining comparably accurate in classification.
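
To make the described workflow concrete, the following minimal Python sketch (not the authors' implementation) clusters a set of association rules with average-linkage agglomerative hierarchical clustering and keeps one representative (medoid) rule per cluster. The rule_distance function and the rule format are assumptions for illustration only; the paper's actual metric combines "direct" and "indirect" measures.

```python
# Minimal sketch: agglomerative clustering of association rules and
# medoid-based selection of one representative rule per cluster.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def rule_distance(rule_a, rule_b):
    """Hypothetical stand-in for the paper's direct/indirect metric:
    Jaccard distance between the rules' antecedent item sets."""
    items_a, items_b = set(rule_a["items"]), set(rule_b["items"])
    union = items_a | items_b
    return 1.0 - len(items_a & items_b) / len(union) if union else 0.0

def cluster_rules(rules, n_clusters):
    """Cluster rules and return one medoid rule per cluster."""
    n = len(rules)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = rule_distance(rules[i], rules[j])

    # Average-linkage agglomerative clustering on the condensed distance matrix.
    labels = fcluster(linkage(squareform(dist), method="average"),
                      t=n_clusters, criterion="maxclust")

    representatives = []
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        # Medoid: the rule closest, on average, to all other rules in its cluster.
        medoid = members[np.argmin(dist[np.ix_(members, members)].sum(axis=1))]
        representatives.append(rules[medoid])
    return representatives

# Example with a toy rule set (format assumed for illustration).
rules = [{"items": ["a", "b"], "label": "x"},
         {"items": ["a", "c"], "label": "x"},
         {"items": ["d", "e"], "label": "y"}]
print(cluster_rules(rules, n_clusters=2))
```

Any rule-to-rule distance could be substituted here; the key design choice reflected in the paper is that each cluster contributes only its most central rule, which is what shrinks the final classifier.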
