Abstract

Silent diseases is an umbrella term for a spectrum of chronic illnesses that produce no clinically obvious signs and are diagnosed at advanced stages, when the damage is irreversible. Current diagnostic strategies for silent diseases depend on self-reported symptoms and on behavior observed over extended periods of time, and to date there are no specific clinical tests to diagnose them. Scientific research suggests that early diagnosis is important for restoring functionality and reducing disease-related complications. Previous studies have primarily focused on feature selection methods to aid medical diagnosis. Traditional feature selection methods focus on correct classification and often ignore features’ costs, that is, the cost of the clinical tests required to acquire each feature’s value. However, in medical diagnosis, features have different associated costs. Because ignoring features’ costs may result in a high-cost diagnostic strategy that cannot be used in practice, developing a low-cost diagnostic strategy remains a subject of much interest. In this paper, new Mixed Integer Programming (MIP) models are proposed, namely the Cost-sensitive Support Vector Machine (CS-SVM) and the Cost-sensitive Multi-surface Method Tree (CS-MSMT), which allow for simultaneous selection of low-cost and informative features. A further strength of the CS-SVM and CS-MSMT is their ability to account for shared costs: both were modified to embed shared costs across feature groups, and the modified models are termed Discounted CS-SVM (dCS-SVM) and Discounted CS-MSMT (dCS-MSMT), respectively. A computationally effective algorithm that integrates aggressive bound tightening with the MIP formulation is also proposed. To demonstrate the effectiveness of the proposed models, different analysis paradigms are conducted on six UCI medical datasets: Chronic Kidney Disease, Hepatitis, Heart Disease, Thyroid, Diabetes, and Leukemia.
The results demonstrate the efficiency and robustness of the CS-SVM and CS-MSMT (and consequently the dCS-SVM and dCS-MSMT) under various conditions. On the Leukemia dataset, the CS-SVM and CS-MSMT improved accuracy by 10.3% and 3.4% and reduced costs by 94.3% and 72.4%, respectively.
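
The abstract does not spell out how shared costs enter the models, so the following is only an illustrative sketch of the general idea, not the paper's formulation. It assumes each feature has its own test cost and that some features belong to a group with a one-time shared cost (for example, a single blood draw serving several lab tests), which is charged once no matter how many features of that group are selected. The function name, feature names, and cost values are all hypothetical.

```python
from typing import Dict, Iterable


def acquisition_cost(
    selected: Iterable[str],
    own_cost: Dict[str, float],
    group_of: Dict[str, str],
    group_cost: Dict[str, float],
) -> float:
    """Cost of acquiring the selected features.

    Each selected feature pays its own cost; each group touched by at
    least one selected feature pays its shared cost exactly once.
    (Assumed cost model, for illustration only.)
    """
    chosen = set(selected)
    total = sum(own_cost[f] for f in chosen)
    # Groups that have at least one selected member.
    groups_used = {group_of[f] for f in chosen if f in group_of}
    total += sum(group_cost[g] for g in groups_used)
    return total


# Hypothetical features and costs (not taken from the paper).
own_cost = {"creatinine": 5.0, "hemoglobin": 3.0, "blood_pressure": 1.0}
group_of = {"creatinine": "blood_draw", "hemoglobin": "blood_draw"}
group_cost = {"blood_draw": 10.0}

one_test = acquisition_cost(["creatinine"], own_cost, group_of, group_cost)
both_tests = acquisition_cost(
    ["creatinine", "hemoglobin"], own_cost, group_of, group_cost
)
print(one_test, both_tests)
```

Under this assumed cost model, adding a second feature from a group whose shared cost is already paid raises the total only by that feature's own cost, which is the kind of discount a selection model can exploit when it is aware of feature groups.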

