Skin infections caused by pathogens such as bacteria, fungi, and viruses are common and can lead to serious health complications if not properly managed, so accurate classification is crucial for effective treatment. This study classifies two skin diseases, Chickenpox and Shingles, using a Decision Tree algorithm applied to an imbalanced dataset sourced from Kaggle. The dataset was split into training (80%) and testing (20%) subsets. Pre-processing consisted of thresholding-based segmentation to isolate regions of interest, followed by feature extraction with Hu Moments to capture the shape characteristics of the lesions. The extracted features were then standardized to a mean of 0 and a variance of 1. Evaluated with 5-fold cross-validation, the classifier achieved a mean accuracy of 66.06%, with precision, recall, and F1-scores indicating moderate performance. The study highlights the challenges posed by imbalanced datasets and the limitations of the Decision Tree algorithm in this context. The results underscore the importance of proper pre-processing and feature extraction, and suggest that more advanced classification techniques and data balancing methods are needed. The research contributes a detailed methodology and comprehensive evaluation metrics, offering insight into the application of machine learning to medical image classification. Future work should focus on improving classifier performance through data augmentation, more advanced feature extraction, and machine learning models better suited to imbalanced datasets.
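The pre-processing steps named above (thresholding, Hu Moments, standardization) can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the function names, the fixed threshold of 127, and the tiny synthetic images are all assumptions made for illustration, and only the first two of the seven Hu invariants are computed. In practice a library routine such as OpenCV's `cv2.HuMoments` would typically be used, and the scaling statistics would be fitted on the training split only.

```python
from statistics import mean, pstdev


def threshold(image, t):
    """Binarize a grayscale image (list of rows): 1 where pixel > t, else 0."""
    return [[1 if px > t else 0 for px in row] for row in image]


def hu_first_two(mask):
    """Return the first two Hu moment invariants of a binary mask."""
    m00 = sum(sum(row) for row in mask)  # foreground area
    if m00 == 0:
        raise ValueError("mask has no foreground pixels")
    # Centroid of the foreground region
    cx = sum(x * v for row in mask for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(mask) for v in row) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized for scale invariance
        mu = sum(((x - cx) ** p) * ((y - cy) ** q) * v
                 for y, row in enumerate(mask) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return [n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2]


def standardize(rows):
    """Z-score each feature column (population std); constant columns map to 0."""
    stats = [(mean(c), pstdev(c)) for c in zip(*rows)]
    return [[(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(r, stats)]
            for r in rows]


# Hypothetical 4x4 "lesion" images: a compact blob and an elongated streak
blob = [[0, 0, 0, 0], [0, 200, 200, 0], [0, 200, 200, 0], [0, 0, 0, 0]]
streak = [[0, 0, 0, 0], [210, 220, 230, 240], [0, 0, 0, 0], [0, 0, 0, 0]]

features = [hu_first_two(threshold(img, 127)) for img in (blob, streak)]
scaled = standardize(features)
print(features)
print(scaled)
```

The resulting feature rows are what a classifier such as a Decision Tree would consume. Because Hu invariants are unchanged by translation, scale, and rotation, the streak yields the same features whether it lies horizontally or vertically in the image.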