Abstract

A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services, because mental impairment paired with high comorbidity increases hospitalization risk. Identifying at-risk individuals allows preventative measures to alleviate this strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, pose a challenging problem space for traditional statistics and machine learning due to high dimensionality and sparse data elements. This article proposes a novel machine learning methodology, entropy regularization with ensemble deep neural networks (ECNN), which provides high predictive performance for hospitalization of patients with dementia while enabling an interpretable heuristic analysis of the model architecture that can identify individual features of importance within a large feature domain space. Experiments on health records containing 54,647 features identified 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. Despite the significant reduction in feature size, the resulting subset still provided a highly competitive hospitalization prediction (Accuracy: 0.759) compared to the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737). The discovery and heuristic evidence of correlation support further clinical study of these medical events as potential novel indicators. There also remains great potential for adaptation of ECNN within other medical big data domains as a data mining tool for novel risk factor identification.

Highlights

  • Dementia: a decline in mental ability severe enough to interfere with daily life

  • Full feature results: the performance of ECNN as a pure classification model was assessed on the full set of features in comparison to a traditional classification methodology with combined feature ranking capability, random forest (RF)

  • A major consideration is the larger variation in predictive performance of ECNN as compared to RF. Testing found this variation to be caused in part by entropy regularization settling into a possibly sub-par local minimum of sparse weights, producing inferior model snapshots that affect overall stability during the final prediction aggregation of the ensemble models
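The role of entropy regularization in producing sparse weights, and of aggregating over ensemble members to rank features, can be illustrated with a minimal sketch. This is not the authors' implementation: the exact penalty and ranking rule are not given in this excerpt, so the entropy definition (Shannon entropy of normalized absolute weights), the averaging rule, and the names `weight_entropy`, `rank_features` and `snapshots` are all assumptions made for illustration.

```python
import math


def weight_entropy(weights, eps=1e-12):
    """Shannon entropy of the normalized absolute weights (assumed penalty).

    Low entropy means the weight mass is concentrated on a few inputs,
    so adding this term to a loss pushes a layer toward sparse weights.
    """
    magnitudes = [abs(w) for w in weights]
    total = sum(magnitudes) + eps
    probs = [m / total for m in magnitudes]
    return -sum(p * math.log(p + eps) for p in probs)


def rank_features(snapshots, top_k):
    """Average each feature's share of absolute weight over all snapshots
    and return the indices of the top_k features (assumed ranking rule)."""
    n_features = len(snapshots[0])
    scores = [0.0] * n_features
    for weights in snapshots:
        total = sum(abs(w) for w in weights) or 1.0
        for i, w in enumerate(weights):
            scores[i] += abs(w) / (total * len(snapshots))
    order = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


# Toy input-weight vectors from three hypothetical ensemble members.
snapshots = [
    [0.90, 0.05, 0.00, 0.05],
    [0.80, 0.00, 0.10, 0.10],
    [0.70, 0.10, 0.00, 0.20],
]
top = rank_features(snapshots, top_k=2)
```

With these toy values, features 0 and 3 hold the largest average share of weight. A snapshot that settles into a poor sparse solution would shift these averages, which is one way to picture the instability noted in the last highlight.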


Summary

Introduction

Dementia is a decline in mental ability severe enough to interfere with daily life. Its primary cause is Alzheimer’s disease, which makes up 60-80% of cases [1]–[4]. Other causes include vascular dementia, thyroid problems and vitamin deficiencies [5]. Current estimates indicate that 47.5 million individuals are living with dementia worldwide, with the figure predicted to triple by 2050 [6], [7]. Around 100,000 individuals with dementia die each year [8], with a worldwide cost of 818 billion US Dollars in 2015 [9]. Dementia poses a significant increase in risk due to continued degradation of mental ability. As such, coupled with a generally higher level of comorbidity as compared to
