Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

João Maroco, Dina Silva, Ana Rodrigues, Isabel Santana, Alexandre De Mendonça, Manuela Guerreiro

doi:10.1186/1756-0500-4-299

Open Access

https://doi.org/10.1186/1756-0500-4-299

Abstract

Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test.

Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (median (Me) = 0.76) and area under the ROC curve (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forests ranked second in overall accuracy (Me = 0.73), with high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most of them sensitivity was around or even lower than a median value of 0.5.

Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia using several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing.
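The abstract reports that Press' Q showed every classifier beating chance, but does not spell out the statistic. A minimal Python sketch follows, assuming the textbook form Q = (N - nK)^2 / (N(K - 1)), where N is the sample size, n the number of correctly classified cases and K the number of groups, compared against a chi-square critical value with 1 degree of freedom. The function names and the example counts are hypothetical and are not taken from the paper.

```python
# Sketch of Press' Q under the textbook formula (an assumption; the
# abstract does not state the formula the authors used).

CHI2_CRIT_1DF_ALPHA_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def press_q(n_total, n_correct, n_groups):
    """Return Press' Q = (N - n*K)^2 / (N * (K - 1))."""
    if n_total <= 0 or n_groups < 2:
        raise ValueError("need n_total > 0 and at least two groups")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must lie between 0 and n_total")
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


def better_than_chance(n_total, n_correct, n_groups,
                       critical=CHI2_CRIT_1DF_ALPHA_05):
    """True if Q exceeds the critical value AND accuracy is above 1/K.

    Q is also large when a classifier is much worse than chance, so the
    direction of the difference is checked explicitly.
    """
    above_chance = n_correct / n_total > 1 / n_groups
    return above_chance and press_q(n_total, n_correct, n_groups) > critical


# Hypothetical example: 76 of 100 cases correct in a two-group problem.
q_example = press_q(100, 76, 2)
```

With these hypothetical counts Q is far above 3.841, so the classifier would be judged better than chance; a classifier with exactly 50 of 100 correct gives Q = 0.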

Highlights

Dementia and cognitive impairment associated with aging are a major medical and social concern
In this paper we evaluated the sensitivity, specificity, overall classification accuracy, area under the ROC and Press’ Q of data mining classifiers like Neural Networks (Multilayer Perceptrons and Radial Basis Networks), Support Vector Machines, Classification Trees and Random Forests as compared to the traditional Linear, Quadratic Discriminant Analysis and Logistic Regression in the prediction of the evolution into dementia of 400 elderly people with Mild Cognitive Impairment
The smallest mean ranks were observed for Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Radial Basis Function (RBF), Classification and Regression Tree (CART) and Quick Unbiased Efficient Statistical Tree (QUEST)
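The mean ranks mentioned in the highlights are the quantity that Friedman's test works with: within each cross-validation fold the classifiers are ranked, and the ranks are then averaged over folds. The sketch below illustrates this in plain Python, assuming one score per classifier per fold and omitting the tie correction for the test statistic. The classifier labels and the fold scores are hypothetical placeholders, not results from the paper.

```python
# Sketch of Friedman mean ranks and the (tie-uncorrected) chi-square
# statistic; data and names below are illustrative assumptions only.


def rank_row(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman(scores):
    """scores maps classifier name -> list of per-fold values.

    Returns (mean_ranks, chi2), where
    chi2 = 12 / (n*k*(k+1)) * sum(R_j^2) - 3*n*(k+1),
    n = number of folds, k = number of classifiers, R_j = rank sums.
    """
    names = list(scores)
    k = len(names)
    n = len(scores[names[0]])
    if any(len(scores[name]) != n for name in names):
        raise ValueError("every classifier needs the same number of folds")
    rank_sums = [0.0] * k
    for fold in range(n):
        row_ranks = rank_row([scores[name][fold] for name in names])
        for idx in range(k):
            rank_sums[idx] += row_ranks[idx]
    mean_ranks = {name: rank_sums[idx] / n for idx, name in enumerate(names)}
    chi2 = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return mean_ranks, chi2


# Hypothetical per-fold accuracies for three unnamed classifiers, 5 folds.
example_scores = {
    "A": [0.60, 0.62, 0.58, 0.61, 0.59],
    "B": [0.70, 0.72, 0.69, 0.71, 0.68],
    "C": [0.75, 0.78, 0.74, 0.77, 0.76],
}
example_mean_ranks, example_chi2 = friedman(example_scores)
```

Here the ordering is identical in every fold, so the mean ranks are 1, 2 and 3 and the statistic reaches its maximum of n(k - 1) = 10. A smaller mean rank means the classifier tended to score lower on the ranked measure.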

Summary

Introduction

Dementia and cognitive impairment associated with aging are a major medical and social concern. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. It would be important to improve the value of neuropsychological tests to predict the progression of MCI patients to dementia. This can be achieved at a clinical level by increasing the number of patients with longer clinical follow-ups. Research has steadily built on the accuracy and efficiency of data mining, with classifiers like Neural Networks (NN), Support Vector Machines (SVM), Classification Trees (CT) and Random Forests (RF) used for medical prediction and classification tasks [13,14,19,20,21,22,23,24,25,26,27]. In medical contexts, sensitivity (the ability to predict the condition when the condition is present), specificity (the ability to predict the absence of the condition when the condition is not present) and the classifier's discriminant power (as estimated from the area under the Receiver Operating Characteristic (ROC) curve) are key features that must be considered when comparing classifiers and diagnostic methods.
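The three measures defined above can be made concrete with a short Python sketch. It assumes binary labels (1 = condition present, 0 = absent) and a continuous classifier score, and it estimates the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, counting ties as one half. The labels, scores and the 0.5 decision threshold are hypothetical and are not data from the study.

```python
# Sketch of sensitivity, specificity, accuracy and AUC on toy data.
# All values below are illustrative assumptions.


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels coded 1/0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Proportion of true cases that are predicted as cases."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of true non-cases that are predicted as non-cases."""
    return tn / (tn + fp)


def auc(y_true, scores):
    """AUC as P(score of positive > score of negative), ties counted as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for seven cases.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
accuracy = (tp + tn) / len(y_true)
area = auc(y_true, scores)
```

On this toy data the classifier detects 2 of 3 positive cases and 3 of 4 negative cases, which illustrates how overall accuracy can hide a gap between sensitivity and specificity. Unlike the other two measures, AUC does not depend on the chosen threshold.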

Journal: BMC Research Notes	Publication Date: Aug 17, 2011
Citations: 370	License type: CC BY 2.0

Similar Papers

Evaluation of nine machine learning methods for estimating daily land surface radiation budget from MODIS satellite data
Shaopeng Li ... Kun Jia
International Journal of Digital Earth | VOL. 15
14 Oct 2022

Prediction of Dementia Patients: A Comparative Approach Using Parametric Versus Nonparametric Classifiers
João Maroco ... Isabel Santana
01 Jan 2013

Identification of Brake Fluid Brands, New and Used Brake Fluid with Discriminant Analysis Based on Near-Infrared Transmittance Spectroscopy.
Li-Hong Tan ... Yong He
Spectroscopy and Spectral Analysis | VOL. 36
01 Oct 2016

Prediction of delayed graft function after kidney transplantation: comparison between logistic regression and machine learning methods.
Alexander Decruyenaere ... Tom Dhaene
BMC Medical Informatics and Decision Making | VOL. 15
14 Oct 2015
