Abstract

This research aims to develop a model that improves the diagnosis of lymphatic diseases using a random forest ensemble machine-learning method trained with a simple sampling scheme. The study was carried out in two major phases: feature selection and classification. In the first phase, a subset of discriminative features was selected from the original 18 using particle swarm optimization (PSO) and several other feature selection techniques, reducing the feature dimension. In the second phase, we applied the random forest ensemble classification scheme to diagnose lymphatic diseases. In experiments with the selected features, we trained the random forest classifier on both the original and the resampled distributions of the dataset. Experimental results demonstrate that the proposed method achieves a remarkable improvement in classification accuracy.

Highlights

  • Nowadays, Computer-Aided Diagnosis (CAD) applications have become one of the key research topics in medical biometrics diagnostic tasks

  • The difference between this article and other articles that address the same topic is that a strong ensemble classifier scheme has been created by combining particle swarm optimization (PSO) feature selection and random forest decision tree methods, which yields more efficient results than any of the other methods tested in this paper

  • Madden [13] presented a comparative study of four classifiers: Naïve Bayes, Tree Augmented Naïve Bayes (TAN), a General Bayesian Network (GBN) with K2 search, and a GBN with hill-climbing search, which achieved accuracies of 82.16%, 81.07%, 77.46% and 75.06%, respectively

Summary

Introduction

Computer-Aided Diagnosis (CAD) applications have become one of the key research topics in medical biometrics diagnostic tasks. Dimensionality reduction aims to reduce computational complexity, with the possible added advantage of improving overall classification performance. It involves eliminating insignificant features before the model is built, which makes screening tests faster, more practical and less costly, an important requirement in medical applications [6]. A CAD system based on a random forest ensemble classifier is introduced to improve classification accuracy for lymph disease diagnosis. This article differs from others on the same topic in that it builds a strong ensemble classifier scheme by combining particle swarm optimization (PSO) feature selection with random forest decision tree methods, which yields more efficient results than any of the other methods tested in this paper. The article begins with the suggested feature selection techniques and the random forest ensemble classifier.
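To make the two preprocessing ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It ranks discrete features by symmetrical uncertainty (one of the filter criteria named in the section outline), using the standard definition SU(X, Y) = 2 · IG(X; Y) / (H(X) + H(Y)), and then draws a simple random resample of the training records. Several things here are assumptions made for illustration: the function names, the toy data, the choice of a filter ranking in place of the PSO search, and sampling with replacement. Training the random forest itself is omitted.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    h_x, h_y = entropy(feature), entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing to measure
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy                # mutual information
    return 2.0 * info_gain / (h_x + h_y)


def select_top_k(rows, labels, k):
    """Return the (sorted) indices of the k features with the highest SU."""
    n_features = len(rows[0])
    scores = [
        symmetrical_uncertainty([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def simple_random_resample(rows, labels, rng):
    """Draw len(rows) records uniformly at random (assumed: with replacement)."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Hypothetical toy data: three discrete features, binary class label.
# Feature 0 matches the label exactly; features 1 and 2 are mostly noise.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 0, 0)]
labels = [0, 0, 1, 1, 0, 1]

rng = random.Random(0)  # fixed seed so the resample is reproducible
selected = select_top_k(rows, labels, k=1)
resampled_rows, resampled_labels = simple_random_resample(rows, labels, rng)
reduced_rows = [tuple(r[j] for j in selected) for r in resampled_rows]
```

In a full pipeline along the lines the paper describes, `reduced_rows` and `resampled_labels` would then be passed to a random forest learner, and the same feature indices would be applied to the held-out test records.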

Feature Selection
Particle Swarm Optimization for Feature Selection
Information Gain Ratio Attribute Evaluation
Symmetrical Uncertainty
Random Forest Ensemble Classification Algorithm
Simple Random Sampling
Performance Measures
Findings
Experimental Study