Density-based clustering of static and dynamic functional MRI connectivity features obtained from subjects with cognitive impairment

D. Rangaprakash, Toluwanimi Odemuyiwa, D. Narayana Dutt, Gopikrishna Deshpande

doi:10.1186/s40708-020-00120-2

Abstract

Various machine-learning classification techniques have been employed previously to classify brain states in healthy and disease populations using functional magnetic resonance imaging (fMRI). These methods generally use supervised classifiers that are sensitive to outliers and require labeling of training data to generate a predictive model. Density-based clustering, which overcomes these issues, is a popular unsupervised learning approach whose utility for high-dimensional neuroimaging data has not been previously evaluated. Its advantages include insensitivity to outliers and the ability to work with unlabeled data. Unlike the popular k-means clustering, it does not require the number of clusters to be specified. In this study, we compare the performance of two popular density-based clustering methods, DBSCAN and OPTICS, in accurately identifying individuals with three stages of cognitive impairment, including Alzheimer’s disease. We used static and dynamic functional connectivity features for clustering, which capture the strength and temporal variation of brain connectivity, respectively. To assess the robustness of clustering to noise/outliers, we propose a novel method called recursive-clustering using additive-noise (R-CLAN). Results demonstrated that both clustering algorithms were effective, although OPTICS with dynamic connectivity features performed best in terms of cluster purity (95.46%) and robustness to noise/outliers. This study demonstrates that density-based clustering can accurately and robustly identify diagnostic classes in an unsupervised way using brain connectivity.

Highlights

Since the successful emergence of functional neuroimaging, a new challenge has surfaced: can a strong correlation be established between brain activity and the cognitive state of an individual? Can we accurately classify neurological diseases based on functional magnetic resonance imaging (fMRI)? There are two major categories of machine learning classification techniques: supervised and unsupervised
With Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the average group-wise cluster purity was 75% with Static Functional Connectivity (SFC) features and 87.88% with Dynamic Functional Connectivity (DFC) features, while Ordering Points to Identify the Clustering Structure (OPTICS) achieved 93.18% cluster purity using SFC features and 95.46% using DFC features
The results show that (i) OPTICS performed better than DBSCAN, (ii) DFC features resulted in higher performance than SFC features, (iii) OPTICS clustering using DFC features resulted in the best overall performance, and (iv) performance for the control and Alzheimer’s disease (AD) groups was higher than for the intermediate early mild cognitive impairment (EMCI) and late mild cognitive impairment (LMCI) groups

Summary

Introduction

Can we accurately classify neurological diseases based on fMRI? There are two major categories of machine learning classification techniques: supervised and unsupervised. Supervised learning, commonly used in fMRI studies, involves splitting the dataset into training and test data. The classifier is ‘trained’ on the labeled training data to determine a generalized model (‘supervised’ learning). Accuracy is measured by testing the model on the test data with known labels. In unsupervised learning, patterns within the entire dataset are used to ‘cluster’ the data without any pre-assigned labels, and cluster purity, rather than accuracy, is measured post hoc against the known ground truth. As such, unsupervised learning is agnostic to pre-assigned labels and determines inherent classes instead of fitting a model to classes provided by us. The ‘cluster purity’ given by clustering is, in principle, the same as the ‘classification accuracy’ given by supervised classifiers (both give the percentage of correct classifications against the known ground truth). We use the term ‘cluster purity’ in this work to highlight that this metric was obtained through unsupervised clustering and not through conventional supervised classification.
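To make the notion of cluster purity concrete, the following is a minimal sketch of one common way to compute it, not necessarily the exact procedure used in this study. It assumes that each cluster is credited with its majority ground-truth label, and that points flagged as noise by a density-based method (marked here with the hypothetical identifier -1) count as incorrect. The function name, the noise convention, and the toy labels are illustrative assumptions.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels, noise_id=-1):
    """Return the fraction of points that carry their cluster's majority label.

    Assumption (not from the paper): points assigned to `noise_id` are
    treated as misclassified, so they lower the purity.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one point is required")

    # Count ground-truth labels inside each (non-noise) cluster.
    label_counts = {}
    for cid, label in zip(cluster_ids, true_labels):
        if cid == noise_id:
            continue
        label_counts.setdefault(cid, Counter())[label] += 1

    # Each cluster contributes the size of its majority label.
    correct = sum(c.most_common(1)[0][1] for c in label_counts.values())
    return correct / len(cluster_ids)


# Toy example with made-up assignments for ten hypothetical subjects.
truth = ["control", "control", "EMCI", "EMCI", "EMCI",
         "LMCI", "AD", "AD", "AD", "AD"]
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2, -1]

purity = cluster_purity(clusters, truth)
print(f"cluster purity: {purity:.2%}")
```

In this toy case, seven of the ten subjects share their cluster's majority label, one subject is left as noise, and the purity is therefore 70%. The ground-truth labels are used only after clustering, which mirrors the post hoc evaluation described above.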

Methods

Results

Discussion

Conclusion