Convolutional neural networks (CNNs), a type of deep learning model, were designed for highly heterogeneous data such as natural images. Neuroimaging data, however, are comparatively homogeneous due to (1) the uniform structure of the brain and (2) additional efforts to spatially normalize the data to a standard template using linear and non-linear transformations. To harness this spatial homogeneity, we propose a new CNN architecture that combines the idea of hierarchical abstraction in CNNs with a prior on the spatial homogeneity of neuroimaging data. Whereas early layers are trained globally using standard convolutional layers, we introduce patch individual filter (PIF) layers for the higher, more abstract layers. By learning filters in individual latent space patches without sharing weights across patches, PIF layers can learn abstract, region-specific features faster. We thoroughly evaluated PIF layers on three different tasks and data sets, namely sex classification on UK Biobank data, Alzheimer’s disease detection on ADNI data, and multiple sclerosis detection on private hospital data, and compared them with two baseline models, a standard CNN and a patch-based CNN. We obtained two main results. First, CNNs using PIF layers converge consistently faster than both baseline models, measured both in run time (seconds) and in number of iterations. Second, both the standard CNN and the PIF model outperformed the patch-based CNN in terms of balanced accuracy and receiver operating characteristic area under the curve (ROC AUC), with a maximal balanced accuracy (ROC AUC) of 94.21% (99.10%) for the sex classification task (PIF model), and 81.24% and 80.48% (88.89% and 87.35%), respectively, for the Alzheimer’s disease and multiple sclerosis detection tasks (standard CNN model). In conclusion, we demonstrated that CNNs using PIF layers result in faster convergence while obtaining the same predictive performance as a standard CNN.
To the best of our knowledge, this is the first study that introduces a prior in the form of an inductive bias to harness the spatial homogeneity of neuroimaging data.
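The idea of learning filters per latent space patch without weight sharing can be illustrated with a minimal NumPy sketch. It is not the implementation evaluated here: the function names `conv3d_same` and `pif_layer`, the `grid` parameter, the non-overlapping patch layout, and the choice to zero-pad each patch independently are all assumptions made for illustration, since the abstract does not specify patch sizes, kernel sizes, or boundary handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv3d_same(x, w):
    """Stride-1, zero-padded 3D cross-correlation (hypothetical helper).

    x: (C_in, D, H, W) feature map
    w: (C_out, C_in, k, k, k) filter bank with odd k
    returns: (C_out, D, H, W)
    """
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p), (p, p)))
    # win has shape (C_in, D, H, W, k, k, k): one k^3 window per voxel.
    win = sliding_window_view(xp, (k, k, k), axis=(1, 2, 3))
    return np.einsum("cdhwijk,ocijk->odhw", win, w)


def pif_layer(x, weights, grid):
    """Apply a separate filter bank to each non-overlapping patch.

    x: (C_in, D, H, W) latent feature map
    grid: (gd, gh, gw) number of patches along each spatial axis (assumed)
    weights: (gd, gh, gw, C_out, C_in, k, k, k), one filter bank per patch,
             i.e. no weight sharing across patches
    returns: (C_out, D, H, W)
    """
    _, D, H, W = x.shape
    gd, gh, gw = grid
    assert D % gd == 0 and H % gh == 0 and W % gw == 0, "grid must divide shape"
    pd, ph, pw = D // gd, H // gh, W // gw
    c_out = weights.shape[3]
    out = np.empty((c_out, D, H, W), dtype=np.result_type(x, weights))
    for i in range(gd):
        for j in range(gh):
            for l in range(gw):
                sl = (
                    slice(None),
                    slice(i * pd, (i + 1) * pd),
                    slice(j * ph, (j + 1) * ph),
                    slice(l * pw, (l + 1) * pw),
                )
                # Each patch is convolved with its own weights only.
                out[sl] = conv3d_same(x[sl], weights[i, j, l])
    return out
```

Because every patch owns its filter bank, a gradient update to one patch's weights affects only the output for that brain region, which is the region specificity described above. A standard convolutional layer corresponds to the special case in which all patches share one filter bank.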