Abstract

In predictive healthcare data analytics, high accuracy is vital because low accuracy can lead to misdiagnosis, which can have serious health consequences or cause death. Fast prediction is also an important requirement, particularly on machines and mobile devices with limited memory and processing power. Both traits are highly desirable in real-time healthcare analytics applications, especially those that run on mobile devices. In this paper, we propose an ensemble regression technique based on CLUB-DRF, a pruned Random Forest that possesses both traits. The speed and accuracy of the method are demonstrated in an experimental study on three medical data sets covering three different diseases.

Highlights

  • Random Forest (RF) has proven its effectiveness as a classification and a regression method in a variety of applications [10]

  • In [11], a new method termed CLUB-DRF was introduced to select diverse decision trees drawn from groups of similar trees, forming a pruned Random Forest ensemble that is much smaller than the initial, traditional RF ensemble [7] and yet performs at least as well

  • Predictive healthcare data analytics applications must be as accurate as possible to minimise misdiagnosis, which can sometimes be fatal

Summary

Introduction

Random Forest (RF) has proven its effectiveness as a classification and regression method in a variety of applications [10]. In [11], a new method termed CLUB-DRF was introduced to select diverse decision trees drawn from groups of similar trees (i.e. clusters of trees), forming a pruned Random Forest ensemble that is much smaller than the initial, traditional RF ensemble [7] and yet performs at least as well. The premise is that grouping similar classifiers into clusters according to their classification patterns, and choosing one or more representative classifiers from each cluster, can result in a pruned and more diversified ensemble.
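
The cluster-then-select idea described above can be illustrated with a minimal sketch. It is not the implementation from [11]. It assumes one-split decision stumps as stand-ins for full decision trees, a simple k-medoids-style clustering over Hamming distance between prediction vectors, and validation accuracy as the criterion for choosing each cluster's representative. The synthetic data and all function names (make_data, train_stump, cluster_trees and so on) are hypothetical and exist only for illustration.

```python
import random

def make_data(n, rng):
    """Synthetic two-feature binary data (hypothetical, illustration only)."""
    rows = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        rows.append((x, 1 if x[0] + x[1] > 1.0 else 0))
    return rows

def train_stump(sample, rng):
    """Fit a one-split 'tree' on a random feature (stand-in for a decision tree)."""
    f = rng.randrange(len(sample[0][0]))
    best = None
    for t in [rng.random() for _ in range(10)]:  # a few candidate thresholds
        left = [y for x, y in sample if x[f] <= t]
        right = [y for x, y in sample if x[f] > t]
        l_lab = int(sum(left) * 2 > len(left))    # majority label on each side
        r_lab = int(sum(right) * 2 > len(right))
        err = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or err < best[0]:
            best = (err, (f, t, l_lab, r_lab))
    return best[1]

def predict(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab

def hamming(a, b):
    """Number of positions where two prediction vectors disagree."""
    return sum(u != v for u, v in zip(a, b))

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def cluster_trees(vectors, k, rng, iters=10):
    """Group trees with similar prediction patterns (k-medoids-style; an assumption)."""
    medoids = rng.sample(range(len(vectors)), k)
    groups = []
    for _ in range(iters):
        assign = {m: [] for m in medoids}
        for i, v in enumerate(vectors):
            nearest = min(medoids, key=lambda m: hamming(v, vectors[m]))
            assign[nearest].append(i)
        groups = [g for g in assign.values() if g]  # drop empty clusters
        new_medoids = [
            min(g, key=lambda c: sum(hamming(vectors[c], vectors[j]) for j in g))
            for g in groups
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return groups

rng = random.Random(0)
train, valid = make_data(200, rng), make_data(100, rng)

# Initial ensemble: each stump is trained on a bootstrap sample of the training set.
forest = [train_stump([rng.choice(train) for _ in train], rng) for _ in range(50)]

# Each tree's "classification pattern" is its prediction vector on the validation set.
labels = [y for _, y in valid]
vectors = [[predict(s, x) for x, _ in valid] for s in forest]

# Cluster similar trees, then keep the most accurate tree of each cluster.
groups = cluster_trees(vectors, k=5, rng=rng)
pruned = [forest[max(g, key=lambda i: accuracy(vectors[i], labels))] for g in groups]

print(len(forest), "trees pruned to", len(pruned))
```

In this sketch the same validation set is used both to build the prediction vectors and to pick the representatives, which keeps the example short. A more careful setup would hold out separate data for the final evaluation of the pruned ensemble.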
