Convolutional Neural Network (CNN)-based models are prone to adversarial attacks, which present a significant hurdle to their reliability and robustness. Attackers may exploit this vulnerability to launch cyber-attacks: an attacker typically adds small, carefully crafted perturbations to original medical images, and when a CNN-based model receives the perturbed image as input, it misclassifies it, even though the perturbation is often imperceptible to the human eye. The emergence of such attacks has raised security concerns about deploying deep learning-based medical image classification systems in clinical environments. Addressing this issue requires a reliable defense mechanism for detecting adversarial attacks on medical images. This study focuses on the robust detection of pneumonia in chest X-ray images with CNN-based models. Various adversarial attacks and defense strategies are evaluated and analyzed in the context of CNN-based pneumonia detection. Earlier studies have observed that a single defense mechanism is usually not effective against more than one type of adversarial attack; this study therefore proposes a defense mechanism that is effective against multiple attack types. A reliable defense framework for pneumonia detection models would enable secure clinical deployment, supporting radiologists and doctors in diagnosis and treatment planning, and could save time and money by automating routine tasks. The proposed defense mechanism uses a convolutional autoencoder to denoise images perturbed by the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), two state-of-the-art attacks carried out at five perturbation magnitudes (ε, epsilon, values). Two pre-trained models, VGG16 and VGG19, and a hybrid of MobileNetV2 and DenseNet169, called the Stack Model, are used for comparison. The results show that the proposed defense mechanism outperforms state-of-the-art approaches. The PGD attack on the VGG16 model achieves the highest attack success rate, reducing overall accuracy by up to 67%, while the autoencoder recovers up to 16% accuracy against PGD attacks on both VGG16 and VGG19.
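As a rough illustration of the two attacks named in the abstract, the sketch below implements single-step FGSM and iterative PGD in PyTorch. The model, normalization, step size, iteration count, and the specific five ε values are assumptions for illustration, not the study's exact configuration.

```python
# Minimal sketch of the FGSM and PGD attacks referenced in the abstract
# (PyTorch). Step size, iteration count, and the epsilon schedule are
# illustrative assumptions, not the study's setup.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """Single-step FGSM: move x by eps in the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def pgd_attack(model, x, y, eps, alpha=0.01, steps=10):
    """Iterative PGD: repeated signed-gradient steps, projected back into
    the L-infinity ball of radius eps around the clean image x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Example: craft attacks at several magnitudes. The five eps values used in
# the study are not listed in the abstract; these are placeholders.
# for eps in [0.005, 0.01, 0.02, 0.05, 0.1]:
#     x_adv = pgd_attack(classifier, images, labels, eps)
```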
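The defense described in the abstract is a convolutional autoencoder that denoises perturbed X-rays before classification. A minimal sketch follows, assuming a small symmetric encoder/decoder trained with an MSE reconstruction loss on pairs of adversarial and clean images; the actual architecture and training details are not given in the abstract.

```python
# Minimal sketch of a convolutional denoising autoencoder defense (PyTorch).
# Layer widths and the training objective are assumptions; the abstract does
# not specify the architecture.
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=3, stride=2,
                               padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=3, stride=2,
                               padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_step(autoencoder, optimizer, x_adv, x_clean):
    """One training step: reconstruct the clean image from its adversarial copy."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(autoencoder(x_adv), x_clean)
    loss.backward()
    optimizer.step()
    return loss.item()

# At inference time the perturbed X-ray is denoised before classification,
# which is how the reported accuracy recovery against PGD would be measured:
# logits = classifier(autoencoder(x_adv))
```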