Increased interpretation of deep learning models using hierarchical cluster-based modelling

Elise Lunde Gjelsvik, Kristin Tøndel

doi:10.1371/journal.pone.0295251

Journal: PLOS ONE
Publication Date: Dec 7, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Norwegian University of Life Sciences

Abstract

Linear prediction models based on data with large inhomogeneity or abrupt non-linearities often perform poorly because relationships between groups in the data dominate the model. Provided that the data are locally linear, this can be overcome by splitting the data into smaller clusters and creating a local model within each cluster. In this study, the previously published Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) procedure was extended to deep learning in order to increase the interpretability of deep learning models through local modelling. Hierarchical Cluster-based Convolutional Neural Networks (HC-CNNs), Hierarchical Cluster-based Recurrent Neural Networks (HC-RNNs) and Hierarchical Cluster-based Support Vector Regression models (HC-SVRs) were implemented and tested on two data sets: spectroscopic data consisting of Fourier Transform Infrared (FT-IR) measurements of raw material dry films, used to predict average molecular weight during hydrolysis, and a simulated data set constructed to contain three clusters of observations with different non-linear relationships between the independent variables and the response. HC-CNN, HC-RNN and HC-SVR outperformed HC-PLSR on the simulated data set, showing the disadvantage of PLSR for highly non-linear data, but for the FT-IR data set there was little to gain in prediction ability from using models more complex than HC-PLSR. Local modelling can ease the interpretation of deep learning models by highlighting differences in feature importance between different regions of the input or output space. Our results showed clear differences in feature importance between the local models, which demonstrates the advantages of a local modelling approach with regard to interpretation of deep learning models.
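
The core idea of cluster-based local modelling can be illustrated with a minimal sketch. This is not the authors' HC-PLSR or HC-CNN implementation: it assumes a single input variable, plain k-means in place of the paper's clustering step, and ordinary least squares in place of PLSR or a neural network. The simulated data, the group centres and all function names (`fit_line`, `kmeans_1d`, `nearest`, `rmse`) are hypothetical and chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def nearest(x, centers):
    """Index of the cluster centre closest to x."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


def kmeans_1d(xs, centers, n_iter=20):
    """Plain k-means on scalars, starting from the given centres."""
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for x in xs:
            groups[nearest(x, centers)].append(x)
        # Keep the old centre if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def rmse(pred, true):
    """Root mean squared error between two equally long sequences."""
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5


# Hypothetical simulated data: three groups along x, each locally linear
# but with a different (intercept, slope), so one global line fits badly.
random.seed(0)
true_models = [(0.0, 2.0), (10.0, -1.0), (-5.0, 0.5)]
group_centers = [0.0, 5.0, 10.0]
xs, ys = [], []
for (a, b), c in zip(true_models, group_centers):
    for _ in range(50):
        x = c + random.uniform(-1.0, 1.0)
        xs.append(x)
        ys.append(a + b * x + random.gauss(0.0, 0.1))

# Step 1: cluster the observations (deterministic start: min, median, max).
centers = kmeans_1d(xs, [min(xs), sorted(xs)[len(xs) // 2], max(xs)])
labels = [nearest(x, centers) for x in xs]

# Step 2: fit one global model and one local model per cluster.
global_model = fit_line(xs, ys)
local_models = []
for k in range(len(centers)):
    xk = [x for x, lab in zip(xs, labels) if lab == k]
    yk = [y for y, lab in zip(ys, labels) if lab == k]
    local_models.append(fit_line(xk, yk))


def predict_global(x):
    a, b = global_model
    return a + b * x


def predict_local(x):
    # Step 3: route a new observation to its cluster, then use that local model.
    a, b = local_models[nearest(x, centers)]
    return a + b * x


rmse_global = rmse([predict_global(x) for x in xs], ys)
rmse_local = rmse([predict_local(x) for x in xs], ys)
local_slopes = [b for _, b in local_models]

print("global RMSE:", round(rmse_global, 3))
print("local RMSE:", round(rmse_local, 3))
print("local slopes:", [round(b, 2) for b in local_slopes])
```

In this toy setting the local slopes play the role of per-cluster feature importance: they differ between regions of the input space, which a single global slope cannot express. With deep learning models the same comparison would be made on whatever feature-importance measure is computed for each local model.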

Full Text