Thalassemia, a genetic blood disorder, presents a significant challenge in Sri Lanka due to its high prevalence. Traditional methods of identifying thalassemia carriers, such as genetic and blood testing, are costly and time-consuming, and may be unavailable to certain demographic groups. However, few studies have examined the efficacy of data mining models for thalassemia carrier detection, so the field is still in its infancy. Evaluating the accuracy and clinical utility of such models is therefore crucial. This study aims to develop a time-efficient model to detect β-thalassemia carriers, reducing the time needed to reach a decision, and to deploy the resulting model as a decision support tool. Eight blood parameters (RBC, HGB, HCT, MCV, MCH, MCHC, RDW, and HbA2) were selected based on the literature. Two model-fitting approaches were introduced, each under a different data selection method: Method 1, model fitting before handling the class imbalance problem, and Method 2, model fitting with the random oversampling technique. Support Vector Machine (SVM) and Probabilistic Neural Network (PNN) models were used for β-thalassemia carrier detection. Method 2 performed better, particularly with PNN Model 2, which achieved 98.75% overall classification accuracy. Moreover, the implemented PNN Model 2 could serve as an efficient decision support tool, offering both time and cost savings in identifying β-thalassemia carriers. Nonetheless, consulting a medical expert is recommended for further investigation.
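As an illustration only, the two ingredients named above, random oversampling and a PNN classifier, can be sketched in a few lines of NumPy. This is not the study's implementation: the helper names `random_oversample` and `pnn_predict`, the smoothing parameter `sigma`, and the synthetic eight-feature data are all assumptions made for the example, and the PNN shown is the textbook Gaussian Parzen-window form.

```python
import numpy as np


def random_oversample(X, y, rng):
    """Resample each smaller class with replacement until every class
    has as many rows as the largest class (hypothetical helper)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=target - n, replace=True)
        parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(parts)
    return X[keep], y[keep]


def pnn_predict(X_train, y_train, X_new, sigma=1.0):
    """Textbook PNN: average a Gaussian kernel over the training rows of
    each class and predict the class with the largest average."""
    classes = np.unique(y_train)
    # Squared Euclidean distances, shape (n_new, n_train).
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]


# Synthetic stand-in data (NOT from the study): 8 features per sample,
# 90 samples of class 0 and 10 samples of class 1.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(90, 8)),
    rng.normal(3.0, 1.0, size=(10, 8)),
])
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = random_oversample(X, y, rng)
queries = np.array([[0.0] * 8, [3.0] * 8])
pred = pnn_predict(X_bal, y_bal, queries)
```

In practice the blood parameters would likely need scaling before a distance-based kernel is applied, and `sigma` would be tuned on held-out data; both choices are left open here.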