Abstract

Over the last decade, machine learning (ML) techniques have been widely applied to disease identification, facilitating early diagnosis and increasing the chance of survival. Most medical datasets, however, are unbalanced, so ML classifiers tend to be biased toward the majority class. This paper proposes a novel fitness function for Genetic Programming (GP) that addresses the unbalanced-data problem in medical data classification. The proposed technique is evaluated on four benchmark medical datasets from the University of California, Irvine (UCI) machine learning repository: chronic kidney disease (CKD), fertility, BUPA liver disorder, and Wisconsin diagnostic breast cancer (WDBC). It achieved best accuracies of 100%, 99.12%, 85.0%, and 75.36% on the CKD, WDBC, fertility, and BUPA datasets respectively, with best AUC values of 1.0, 0.99, 0.92, and 0.75 respectively. These results improve on other GP and SVM methods, supporting the effectiveness of the proposed algorithm.
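
The majority-class bias described above can be illustrated with a small sketch. The abstract does not specify the proposed fitness function, so the `balanced_fitness` function below is only an assumed stand-in (the mean of per-class recall), not the authors' method; the function names and the 90/10 toy labels are hypothetical. It shows why plain accuracy rewards a classifier that ignores the minority class, and how a class-balanced score penalizes it.

```python
def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of samples labelled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_fitness(y_true, y_pred):
    """Assumed stand-in fitness (not the paper's): mean of per-class recall."""
    recalls = []
    for c in sorted(set(y_true)):
        # Indices of all samples whose true label is class c
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(y_pred[i] == c for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical unbalanced toy data: 90 healthy (0), 10 diseased (1)
y_true = [0] * 90 + [1] * 10
# A trivial classifier that always predicts the majority class
always_majority = [0] * 100

print(accuracy(y_true, always_majority))          # 0.9
print(balanced_fitness(y_true, always_majority))  # 0.5
```

Under plain accuracy the trivial classifier looks strong (0.9) even though it misses every diseased case; the balanced score drops to 0.5, which is the kind of pressure a GP fitness function would need in order to evolve classifiers that also serve the minority class.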
