Abstract

In machine vision, the convolutional neural network (CNN) is a widely used and important deep learning method. Because the inner workings of CNNs are often treated as a black box, it is difficult to understand how their predictions are formed. As a result, interest in building more interpretable AI systems has grown among AI experts. Several strategies have shown promise in improving the interpretability of CNNs, including Class Activation Map (CAM), Grad-CAM, LIME, and other CAM-based approaches. These methods have drawbacks, however, such as architectural constraints or the need for gradient computations. We propose a simple framework, Adaptive Learning based CAM (Adaptive-CAM), that exploits the connection between activation maps and network predictions by temporarily masking particular feature maps. According to the Average Drop-Coherence-Complexity (ADCC) metric, our method outperforms Score-CAM and another CAM-based activation map strategy on Residual Network-based models, with improvements ranging from 3.78% to 7.72%; the exception is the VGG16 model, where performance declines by 1.94%. Additionally, Adaptive-CAM generates saliency maps that are on par with CAM-based methods and around 153 times superior to other CAM-based methods.
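The idea of masking feature maps to relate activation maps to predictions can be illustrated with a small, generic sketch. This is not the Adaptive-CAM procedure itself, which the abstract does not specify; it is an ablation-style weighting written under stated assumptions. The names `masking_saliency` and `toy_score` are hypothetical, activation maps are plain nested lists, and `toy_score` is a stand-in for a real classifier head.

```python
from typing import Callable, List

Map = List[List[float]]  # one 2D activation map (H x W)


def masking_saliency(maps: List[Map], score: Callable[[List[Map]], float]) -> Map:
    """Weight each activation map by the score drop seen when it is masked.

    Illustrative assumption only: the abstract does not state how
    Adaptive-CAM computes its weights.
    """
    height, width = len(maps[0]), len(maps[0][0])
    zero_map = [[0.0] * width for _ in range(height)]
    base = score(maps)

    weights = []
    for k in range(len(maps)):
        # Temporarily replace map k with zeros; the input list is left untouched.
        masked = maps[:k] + [zero_map] + maps[k + 1:]
        weights.append(base - score(masked))

    # Weighted sum of the maps, then ReLU to keep only positive evidence.
    saliency = [[0.0] * width for _ in range(height)]
    for w, m in zip(weights, maps):
        for i in range(height):
            for j in range(width):
                saliency[i][j] += w * m[i][j]
    return [[max(v, 0.0) for v in row] for row in saliency]


def toy_score(maps: List[Map]) -> float:
    # Hypothetical stand-in for a class score: it depends only on map 0.
    return sum(sum(row) for row in maps[0])


maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[5.0, 5.0], [5.0, 5.0]],
]
saliency = masking_saliency(maps, toy_score)
```

In this toy setup, masking the second map leaves the score unchanged, so it receives zero weight and the saliency map is driven entirely by the first map. No gradients are needed, which is the property such masking-based schemes share.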
