Abstract
Convolutional neural networks (CNNs) are opening new possibilities in the realm of satellite remote sensing. CNNs are especially useful for capturing the information in spatial patterns that is evident to the human eye but has eluded classical pixelwise retrieval algorithms. However, the black-box nature of CNN predictions makes them difficult to interpret, hindering their trustworthiness. This paper explores a new way to simplify CNNs that allows them to be implemented in a fully transparent and interpretable framework. This clarity is accomplished by moving the inner workings of the CNN out into a feature engineering step and replacing the CNN with a regression model. The specific example of the GOES Radar Estimation via Machine Learning to Inform NWP (GREMLIN) model is used to demonstrate that such simplifications are possible and to show the benefits of the interpretable approach. GREMLIN translates images of GOES radiances and lightning into images of radar reflectivity, and previous research used explainable artificial intelligence (XAI) approaches to explain some aspects of how GREMLIN makes predictions. However, the Interpretable GREMLIN model shows that XAI missed several strategies, and XAI does not provide guarantees on how the model will respond when confronted with new scenarios. In contrast, the interpretable model establishes well-defined relationships between inputs and outputs, offering a clear mapping of the spatial context utilized by the CNN to make accurate predictions and providing guarantees on how the model will respond to new inputs. The significance of this work is that it provides a new approach for developing trustworthy artificial intelligence models.

Significance Statement

Convolutional neural networks (CNNs) are very powerful tools for interpreting and processing satellite imagery. However, the black-box nature of their predictions makes them difficult to interpret, compromising their trustworthiness when applied in the context of high-stakes decision-making. This paper develops an interpretable version of a CNN model, showing that it has performance similar to that of the original CNN. The interpretable model is analyzed to obtain clear relationships between inputs and outputs, which elucidates the nature of the spatial context utilized by CNNs to make accurate predictions. The interpretable model has a well-defined response to inputs, providing guarantees for how it will respond to novel inputs. The significance of this work is that it provides an approach to developing trustworthy artificial intelligence models.
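To illustrate the general idea of moving spatial context out of a CNN and into an explicit feature engineering step followed by a regression, the sketch below is a minimal, hypothetical example (not the paper's implementation): it builds neighborhood-average features from two synthetic satellite-like channels with fixed averaging windows and fits an ordinary linear regression to a pixelwise target. All variable names, window sizes, and the synthetic data are assumptions made for illustration only.

# Minimal sketch (assumptions, not the paper's code): replace a CNN with
# hand-engineered spatial-context features plus a linear regression.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for GOES-like inputs on a 64x64 grid
# (e.g., an infrared brightness temperature and a lightning density field).
ir = rng.normal(260.0, 15.0, size=(64, 64))
glm = rng.poisson(0.2, size=(64, 64)).astype(float)

# Feature engineering step: pixel values plus neighborhood means at several
# window sizes, exposing spatial context explicitly instead of hiding it
# inside learned convolutional filters.
def spatial_features(channel, windows=(3, 9, 15)):
    feats = [channel]
    for w in windows:
        feats.append(uniform_filter(channel, size=w))
    return feats

features = spatial_features(ir) + spatial_features(glm)
X = np.stack([f.ravel() for f in features], axis=1)   # (n_pixels, n_features)

# Synthetic "radar reflectivity" target with a known dependence on the inputs,
# so the regression has something to recover in this toy example.
y = (0.5 * (280.0 - ir) + 10.0 * uniform_filter(glm, size=9)).ravel()

# Transparent model: every prediction is an explicit weighted sum of features.
model = LinearRegression().fit(X, y)
print("learned coefficients:", np.round(model.coef_, 2))

In a setup of this kind, the model's response to any input is fully determined by the fixed feature definitions and the regression coefficients, which is what makes the well-defined input-output relationships and response guarantees described above possible.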