A novel fitness function in genetic programming for medical data classification

Arvind Kumar, Nishant Sinha, Arpit Bhardwaj

doi:10.1016/j.jbi.2020.103623

Open Access

https://doi.org/10.1016/j.jbi.2020.103623

Journal: Journal of Biomedical Informatics	Publication Date: Nov 14, 2020
Citations: 30	License type: publisher-specific-oa

Affiliation: Bennett University

Abstract

In the last decade, machine learning (ML) techniques have been widely applied to identify different diseases, which facilitates early diagnosis and increases the chance of survival. Most medical datasets, however, are unbalanced, and ML classifiers trained on them tend to be biased toward the majority class. This paper proposes a novel fitness function in Genetic Programming (GP) for medical data classification that handles the problem of unbalanced data. Four benchmark medical datasets, namely chronic kidney disease (CKD), Fertility, BUPA liver disorder, and Wisconsin diagnostic breast cancer (WDBC), were taken from the University of California, Irvine (UCI) machine learning repository and classified using the proposed technique. The proposed technique achieved best accuracies of 100%, 99.12%, 85.0%, and 75.36% on the CKD, WDBC, Fertility, and BUPA datasets, respectively, and best AUCs of 1.0, 0.99, 0.92, and 0.75, respectively. These results show an improvement over other GP and support vector machine (SVM) methods, confirming the efficiency of the proposed algorithm.
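The bias toward the majority class mentioned in the abstract can be illustrated with a small sketch. The abstract does not define the proposed fitness function, so the code below is not the authors' method: it only contrasts plain accuracy with balanced accuracy (the mean of per-class recall), a common class-aware measure, on hypothetical toy labels. The function names and the 90/10 class split are assumptions made for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        # Indices of all samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical unbalanced toy data (not from the paper):
# 90 majority-class samples (0) and 10 minority-class samples (1).
y_true = [0] * 90 + [1] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)           # 90 of 100 correct
bal = balanced_accuracy(y_true, y_pred)  # recall 1.0 on class 0, 0.0 on class 1

print(f"accuracy = {acc:.2f}, balanced accuracy = {bal:.2f}")
```

In this toy case the always-majority classifier scores high on plain accuracy while missing every minority-class sample, which is the kind of bias a class-aware fitness measure is meant to penalize.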

Full Text