Abstract
Learning a classifier from imbalanced data is a challenging problem in machine learning. A dataset is said to be imbalanced when the number of instances belonging to one class is much smaller than the number of instances belonging to the other class. Classifiers that prove effective on standard data often fail when the data is imbalanced, because they are overtrained on the majority class instances. Since class imbalance is a common characteristic of real-world data, better classifiers are needed. This paper proposes a novel instance-based classification algorithm called Weighted Pattern Matching based Classification (PMC+) for classifying imbalanced data. PMC+ classifies an unlabelled instance by computing the absolute difference between its feature values and those of the instances in the dataset. PMC+ employs a simple classification procedure with weights and shows reasonably good performance. To improve the performance of PMC+, a Fireworks-based Feature and Weight Selection algorithm built on the idea of PMC+ is also proposed. PMC+ is evaluated on 44 binary imbalanced datasets and 15 multiclass imbalanced datasets. Although PMC+ does not employ a resampling or cost-sensitive method, experiments show that it is effective for the classification of imbalanced data. The experimental results were validated using several non-parametric statistical tests.
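To make the idea of comparing feature values by weighted absolute difference concrete, the following is a minimal sketch only. The abstract does not state how PMC+ aggregates the differences or picks the final label, so the sketch assumes the simplest possible rule: sum the weighted absolute differences per training instance and return the label of the instance with the smallest total. That rule is effectively a weighted L1 nearest-neighbour classifier and should not be read as the actual PMC+ procedure. The function names, the toy data, and the weight values are all hypothetical.

```python
from typing import Hashable, List, Sequence


def weighted_abs_diff(x: Sequence[float], y: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Sum of per-feature absolute differences, each scaled by a weight."""
    return sum(w * abs(a - b) for a, b, w in zip(x, y, weights))


def classify(train_X: List[Sequence[float]], train_y: List[Hashable],
             query: Sequence[float], weights: Sequence[float]) -> Hashable:
    """Return the label of the training instance closest to `query`.

    ASSUMPTION: "closest" means smallest weighted absolute difference.
    The decision rule used by PMC+ itself is not given in the abstract.
    """
    best_label = None
    best_score = float("inf")
    for features, label in zip(train_X, train_y):
        score = weighted_abs_diff(features, query, weights)
        if score < best_score:
            best_score, best_label = score, label
    return best_label


# Hypothetical imbalanced toy data: three majority instances, one minority.
train_X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [4.0, 1.0]]
train_y = ["majority", "majority", "majority", "minority"]
weights = [1.0, 0.5]  # hypothetical feature weights

prediction = classify(train_X, train_y, [3.5, 1.5], weights)
print(prediction)
```

In this sketch the per-feature weights are fixed by hand. In the paper they are chosen by the Fireworks-based Feature and Weight Selection algorithm, which is not reproduced here.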