Abstract

An important research issue in disease diagnosis is selecting a subset of attributes that produces the desired output with a satisfactory level of accuracy. The aim of this study is to improve the accuracy of diagnosing primary-stage oral cancer by eliminating attributes that are strongly correlated with other, already selected attributes. The study proposes a hybrid feature selection method based on a correlation evaluator and linear forward selection. The proposed method reduces the original 25 attributes of an oral cancer data set to 14 features for diagnosing the stage of oral cancer. Seven classifiers (updatable Naive Bayes, multilayer perceptron, k-nearest neighbors, support vector machine, Rules-DTNB, Tree-J48, and Tree-SimpleCart) are then used to evaluate the efficiency of the feature selection methods. All evaluations are conducted in the Waikato Environment for Knowledge Analysis (WEKA) with tenfold cross-validation. The empirical comparison shows that the feature subset generated by the proposed feature selection method, combined with over-sampling in the preprocessing phase, significantly improves the accuracy of all the classifiers used on the oral cancer data set, with a mean accuracy of 96.53%. The results support the use of a suitable subset of variables in oral cancer diagnosis. Future work includes using the proposed feature subset to classify and to generate differential probabilities for stage diagnosis among oral cancer patients.
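The idea of dropping attributes that are correlated with already selected ones can be illustrated with a minimal sketch. It is not the authors' implementation and not WEKA's code: it assumes a CFS-style merit score (high feature-to-class correlation, low feature-to-feature correlation) combined with a plain greedy forward search, and it runs on synthetic data rather than the oral cancer data set. The function names `pearson`, `merit`, and `forward_select` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def merit(subset, columns, target):
    """CFS-style merit (assumed here): rewards relevance, penalizes redundancy."""
    k = len(subset)
    # Mean absolute correlation between each selected feature and the target.
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute correlation between every pair of selected features.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(columns, target):
    """Greedily add the feature that most improves merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        score, j = max((merit(selected + [j], columns, target), j) for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected


# Synthetic data (hypothetical, NOT the oral cancer data set).
rng = random.Random(0)
n = 200
f0 = [rng.gauss(0, 1) for _ in range(n)]
f1 = [v + rng.gauss(0, 0.01) for v in f0]  # near-duplicate of f0 (redundant)
f2 = [rng.gauss(0, 1) for _ in range(n)]   # independent, informative
f3 = [rng.gauss(0, 1) for _ in range(n)]   # pure noise
target = [a + b for a, b in zip(f0, f2)]

columns = [f0, f1, f2, f3]
chosen = forward_select(columns, target)
print("selected feature indices:", sorted(chosen))
```

In this toy setting the search should keep only one of the two near-duplicate features, keep the independent informative one, and skip the noise feature. A real pipeline would follow the selection step with over-sampling and tenfold cross-validation of each classifier, as the abstract describes.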
