Abstract

Classification is one of the most fundamental and well-known tasks in data mining. Class imbalance is the most challenging issue encountered when performing classification, i.e. when the number of instances belonging to the class of interest (minor class) is much lower than that of other classes (major classes). The class imbalance problem has become more and more marked while applying machine learning algorithms to real-world applications such as medical diagnosis, text classification, fraud detection, etc. Standard classifiers may yield very good results regarding the majority classes. However, this kind of classifiers yields bad results regarding the minority classes since they assume a relatively balanced class distribution and equal misclassification costs. To overcome this problem, we propose, in this paper, a novel associative classification algorithm called Association Rule-based Classification for Imbalanced Datasets (ARCID). This algorithm aims to extract significant knowledge from imbalanced datasets by emphasizing on information extracted from minor classes without drastically impacting the predictive accuracy of the classifier. Experimentations, against five datasets obtained from the UCI repository, have been conducted with reference to four assessment measures. Results show that ARCID outperforms standard algorithms. Furthermore, it is very competitive to Fitcare which is a class imbalance insensitive algorithm.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.