Learning Transferable Perturbations for Image Captioning

  • Abstract
  • Similar Papers
Abstract

Recent studies have shown that state-of-the-art deep learning models can be attacked by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and the adversarial examples they generate transfer poorly to other models. To generate stronger adversarial examples more quickly, we propose to learn the perturbations with a generative model governed by three novel loss functions. In the image domain, an image feature distortion loss maximizes the distance between the encoded features of original images and their corresponding adversarial examples. Across the image and caption domains, a local-global mismatching loss pushes the encoded representations of adversarial images as far as possible from those of the ground-truth captions in the common semantic space, from both local and global perspectives. In the language domain, a language diversity loss makes the captions generated for adversarial examples as different as possible from the correct captions. Extensive experiments show that our generative model can efficiently produce adversarial examples that generalize to attack image captioning models trained on unseen large-scale datasets or with different architectures, and even a commercial image captioning service.
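
As an illustration only, the first two loss terms could be wired up as in the following minimal PyTorch-style sketch; `generator`, `img_encoder`, and `caption_emb` are hypothetical stand-ins rather than the authors' implementation, and the language diversity term is only indicated:

```python
# Hedged sketch of the image feature distortion and local-global
# mismatching losses; all module names here are hypothetical.
import torch
import torch.nn.functional as F

def attack_losses(x, caption_emb, generator, img_encoder, eps=8 / 255):
    """Loss terms for one batch of images `x` in [0, 1]."""
    delta = eps * torch.tanh(generator(x))          # bounded perturbation
    x_adv = torch.clamp(x + delta, 0.0, 1.0)

    f_clean = img_encoder(x)                        # image-domain features
    f_adv = img_encoder(x_adv)

    # 1) Image feature distortion: maximize the feature distance between
    #    clean and adversarial images (negated for minimization).
    l_feat = -F.mse_loss(f_adv, f_clean)

    # 2) Local-global mismatching: drive the adversarial image embedding
    #    away from the ground-truth caption embedding in the shared
    #    semantic space by minimizing their cosine similarity.
    l_match = F.cosine_similarity(f_adv.flatten(1),
                                  caption_emb.flatten(1)).mean()

    # 3) The language diversity loss would additionally compare generated
    #    and reference captions (e.g., a negated n-gram overlap score);
    #    it is omitted here because it needs a full caption decoder.
    return l_feat + l_match
```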

Similar Papers
  • Research Article
  • Cited by 1
  • 10.1117/1.jei.31.6.063034
Generating traceable adversarial text examples by watermarking in the semantic space
  • Nov 26, 2022
  • Journal of Electronic Imaging
  • Mingjie Li + 2 more

Adversarial examples have been proven to reveal the vulnerability of deep neural network (DNN) models, which can be used to evaluate model performance and further improve model robustness. Because text data is discrete, it is more difficult to generate adversarial examples in the natural language processing (NLP) domain than in the image domain. One of the challenges is that generated adversarial text examples should maintain grammatical correctness and semantic similarity to the original texts. In this paper, we propose an adversarial text generation model that generates high-quality adversarial text examples through an end-to-end model. Moreover, the adversarial text examples generated by our proposed model are embedded with watermarks, which can mark and trace the source of the generated adversarial text examples and prevent the model from being maliciously or illegally used. The experimental results show that the attack success rate of the proposed model still exceeds 88% even on the AG’s News dataset, where generating adversarial text examples is more difficult, and the quality of the adversarial text examples it generates is higher than that of the baseline models. At the same time, because the generated adversarial text examples are embedded with strongly robust watermarks, the model can be better protected.

  • Book Chapter
  • Cited by 28
  • 10.1007/978-3-030-30508-6_54
Evaluating Defensive Distillation for Defending Text Processing Neural Networks Against Adversarial Examples
  • Jan 1, 2019
  • Marcus Soll + 3 more

Adversarial examples are artificially modified input samples which lead to misclassifications while not being detectable by humans. These adversarial examples are a challenge for many tasks such as image and text classification, especially as research shows that many adversarial examples are transferable between different classifiers. In this work, we evaluate the performance of a popular defensive strategy against adversarial examples called defensive distillation, which can be successful in hardening neural networks against adversarial examples in the image domain. However, instead of applying defensive distillation to networks for image classification, we examine, for the first time, its performance on text classification tasks and also evaluate its effect on the transferability of adversarial text examples. Our results indicate that defensive distillation has only a minimal impact on text-classifying neural networks: it neither helps increase their robustness against adversarial examples nor prevents the transferability of adversarial examples between neural networks.
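
For context, defensive distillation itself follows a well-known recipe: train a teacher network with a raised softmax temperature, then train the student on the teacher's softened labels at the same temperature. A minimal sketch, assuming placeholder `teacher` and student models:

```python
# Hedged sketch of the defensive distillation recipe evaluated above;
# the models themselves are placeholders.
import torch
import torch.nn.functional as F

T = 20.0  # distillation temperature (illustrative value)

def soft_labels(teacher, x):
    # Teacher's softened class probabilities at temperature T.
    return F.softmax(teacher(x) / T, dim=1).detach()

def distillation_loss(student_logits, soft_targets):
    # Cross-entropy against the soft labels at the same temperature.
    # At test time the student runs at T = 1, which flattens its
    # input gradients and is what is meant to harden the network.
    log_probs = F.log_softmax(student_logits / T, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()
```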

  • Dissertation
  • 10.33915/etd.8028
Deep Models for Improving the Performance and Reliability of Person Recognition
  • Jan 1, 2021
  • Sobhan Soleymani

Deep models have provided high accuracy for different applications such as person recognition, image segmentation, image captioning, scene description, and action recognition. In this dissertation, we study deep learning models and their application in improving the performance and reliability of person recognition. This dissertation focuses on five aspects of person recognition: (1) multimodal person recognition, (2) quality-aware multi-sample person recognition, (3) text-independent speaker verification, (4) adversarial iris examples, and (5) morphed face images. First, we discuss the application of multimodal networks consisting of face, iris, fingerprint, and speech modalities in person recognition. We propose multi-stream convolutional neural network architectures to incorporate person recognition traits, introducing three multimodal frameworks: multi-level abstraction, generalized compact bilinear pooling, and quality-aware multi-sample multimodal fusion. Then, a novel cross-device text-independent speaker verification architecture, which relies on spectro-temporal and prosodic features, is introduced. Through intensive experimental setups, the performance of each proposed framework is studied. Although biometric recognition systems are fast becoming part of security applications, these systems are still vulnerable to image manipulations. To study the reliability of deep models in person recognition, we focus on adversarial examples and morphed images. We introduce adversarial examples for an iris recognition framework with non-targeted and targeted attacks and study the possibility of fooling an iris recognition system in white-box and black-box frameworks. Then, we present defense strategies to detect adversarial iris examples. These defense strategies are based on wavelet-domain denoising of the input examples, investigating each wavelet sub-band. Finally, we study morphed face images, in which a facial reference image can be verified as two or more separate identities. Here, a novel differential morph attack detection framework using a deep Siamese network is proposed. Then, we improve the performance utilizing landmark and appearance disentanglement through contrastive representations.

  • Conference Article
  • Cited by 3
  • 10.1109/bibm47256.2019.8983217
How Robust is Your Automatic Diagnosis Model?
  • Nov 1, 2019
  • Ke Wang + 3 more

Automatic diagnosis based on clinical notes has recently become a popular research field, and many proposed deep learning models have achieved competitive performance in disease inference. However, previous research reveals that deep learning models are susceptible to negligibly perturbed inputs named adversarial examples, which conflicts with the safety and reliability requirements of the medical domain. To analyze the vulnerability and robustness of current automatic diagnosis models, we investigate the generation of adversarial text examples. The main challenges in generating adversarial text examples fall into three parts. First, the word embedding space is discrete, which makes it hard to make perturbations as small as those used in adversarial image example generation. Second, previous adversarial example generation methods focus mainly on multi-class classification models, while automatic diagnosis is a multi-label classification task. Third, the semantic and medical meaning of clinical notes is vital in disease inference, and even small perturbations can change it to a large extent. In this paper, we address these three main challenges and propose Clinical-Attacker, a general framework for both white-box and black-box adversarial text example generation against automatic diagnosis models. Experimental results on the MIMIC-III dataset demonstrate that our framework can easily alter the predictions of automatic diagnosis models while preserving the semantic and medical meaning of the notes.

  • Research Article
  • Cited by 2
  • 10.3390/electronics14153015
Understanding and Detecting Adversarial Examples in IoT Networks: A White-Box Analysis with Autoencoders
  • Jul 29, 2025
  • Electronics
  • Wafi Danesh + 2 more

Novel networking paradigms such as the Internet of Things (IoT) have expanded their usage and deployment to various application domains. Consequently, previously unseen critical security vulnerabilities, such as zero-day attacks, have emerged in such deployments. The design of intrusion detection systems for IoT networks is often challenged by a lack of labeled data, which complicates the development of robust defenses against adversarial attacks. Deep learning-based network intrusion detection systems (NIDS) have been used to counteract these emerging security vulnerabilities. However, the deep learning models used in such NIDS are vulnerable to adversarial examples: specifically engineered samples, tailored to a particular deep learning model, that are developed by minimally perturbing network packet features and are intended to cause misclassification. Such examples can bypass NIDS or cause the rejection of regular network traffic. Research in the adversarial example detection domain has yielded several prominent methods; however, most of them involve computationally expensive retraining steps and require access to labeled data, which is often lacking in IoT network deployments. In this paper, we propose an unsupervised method for detecting adversarial examples that performs early detection based on the intrinsic characteristics of the deep learning model. Our proposed method requires neither computationally expensive retraining nor extra hardware overhead for implementation. We first perform adversarial example generation on a deep learning model using autoencoders. After successful generation, we perform adversarial example detection using the intrinsic characteristics of the layers in the deep learning model. We also test the robustness of our detection method against further compromise by the attacker. We tested our approach on the Kitsune datasets, state-of-the-art datasets obtained from deployed IoT network scenarios. Our experimental results show an average adversarial example generation time of 0.337 s and an average detection rate of almost 100%. A robustness analysis, however, reveals that an attacker can easily bypass the detection mechanism using low-magnitude log-normal Gaussian noise, reducing adversarial example detection by almost 100% after compromise.
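
The detector itself is not spelled out in this summary; as a hedged illustration of unsupervised detection from a model's intrinsic layer characteristics, one could baseline activation statistics on clean traffic and flag deviations:

```python
# Illustrative detector in the spirit described above: the layer choice,
# z-score statistic, and threshold are our assumptions, not the paper's.
import numpy as np

def fit_baseline(clean_activations):
    # clean_activations: (n_samples, n_units) hidden-layer outputs
    # recorded on clean traffic.
    mu = clean_activations.mean(axis=0)
    sigma = clean_activations.std(axis=0) + 1e-8
    return mu, sigma

def is_adversarial(activation, mu, sigma, z_thresh=3.0):
    # Flag a sample whose mean absolute z-score exceeds the threshold.
    z = np.abs((activation - mu) / sigma).mean()
    return z > z_thresh
```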

  • Research Article
  • Cited by 1
  • 10.3390/electronics14071274
An Adversarial Example Generation Algorithm Based on DE-C&W
  • Mar 24, 2025
  • Electronics
  • Ran Zhang + 2 more

Security issues surrounding deep learning models weaken their application effectiveness in various fields. Studying attacks against deep learning models contributes to evaluating their security and improving it in a targeted manner. Among the methods used for this purpose, adversarial example generation methods for deep learning models have become a hot topic in academic research. To overcome problems such as extensive network access, high attack costs, and limited universality in generating adversarial examples, this paper proposes a generic algorithm for adversarial example generation based on an improved DE-C&W. The algorithm employs an improved differential evolution (DE) algorithm to conduct a global search of the original examples, searching for vulnerable sensitive points susceptible to attack. Then, random perturbations are added to these sensitive points to obtain adversarial examples, which are used as the initial input of the C&W attack. The loss functions of the C&W attack algorithm are constructed from these initial input examples, and the loss function is further optimized using the Adaptive Moment Estimation (Adam) algorithm to obtain the optimal perturbation vector. The experimental results demonstrate that the algorithm not only ensures that the generated adversarial examples achieve a higher attack success rate, but also that they exhibit better transferability while reducing the average number of queries and lowering attack costs.
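
As a rough sketch of the second stage, with the DE search abstracted into a precomputed boolean `mask` of sensitive points, a C&W-style margin loss refined with Adam might look like this; hyperparameters are illustrative, not the paper's:

```python
# Hedged sketch: C&W-style refinement with Adam on DE-selected pixels.
import torch

def cw_refine(model, x, label, mask, steps=200, c=1.0, lr=0.01):
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = torch.clamp(x + delta * mask, 0.0, 1.0)
        logits = model(x_adv)
        true = logits.gather(1, label.view(-1, 1)).squeeze(1)
        # Best non-true logit: blank out the true class, then take the max.
        other = logits.scatter(1, label.view(-1, 1), -1e9).max(1).values
        # C&W objective: small masked L2 perturbation plus a margin term
        # that pushes the true-class logit below the runner-up.
        loss = (delta * mask).pow(2).sum() \
             + c * torch.clamp(true - other, min=0).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.clamp(x + delta.detach() * mask, 0.0, 1.0)
```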

  • Research Article
  • Cited by 5
  • 10.1155/2022/9962972
Restricted-Area Adversarial Example Attack for Image Captioning Model
  • Jul 7, 2022
  • Wireless Communications and Mobile Computing
  • Hyun Kwon + 1 more

Deep neural networks provide good performance in the fields of image recognition, speech recognition, and text recognition. For example, recurrent neural networks are used by image captioning models to generate text after an image recognition step, thereby providing captions for images. The image captioning model first extracts features from the image and generates a representation vector; it then generates the caption text using a recurrent neural network. This model has a weakness, however: it is vulnerable to adversarial examples. In this paper, we propose a method for generating restricted adversarial examples that target image captioning models. By adding a minimal amount of noise to only a specific area of an original sample image, the proposed method creates an adversarial example that remains correctly recognizable to humans yet is misinterpreted by the target model. We evaluated the method’s performance through experiments with the MS COCO dataset, using TensorFlow as the machine learning library. The results show that the proposed method generates a restricted adversarial example that is misinterpreted by the target model while minimizing its distortion from the original sample.
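
A hedged sketch of the restricted-area idea, with a placeholder `loss_fn` standing in for the captioning objective and an arbitrary rectangle as the permitted region:

```python
# Illustrative restricted-area attack: gradient ascent on a captioning
# loss with noise confined to one rectangle. Region, step rule, and
# budget are our assumptions, not the paper's exact setup.
import torch

def restricted_attack(loss_fn, x, y1, x1, y2, x2, eps=0.05, steps=50):
    mask = torch.zeros_like(x)
    mask[..., y1:y2, x1:x2] = 1.0           # noise allowed only here
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(torch.clamp(x + delta * mask, 0.0, 1.0))
        loss.backward()                     # ascend the captioning loss
        with torch.no_grad():
            delta += (eps / steps) * delta.grad.sign()
            delta.clamp_(-eps, eps)         # keep the distortion small
            delta.grad.zero_()
    return torch.clamp(x + delta.detach() * mask, 0.0, 1.0)
```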

  • Research Article
  • Cited by 3
  • 10.3390/s22103826
Clustering Approach for Detecting Multiple Types of Adversarial Examples
  • May 18, 2022
  • Sensors (Basel, Switzerland)
  • Seok-Hwan Choi + 3 more

With intentional feature perturbations to a deep learning model, an adversary generates an adversarial example to deceive the model. As adversarial examples have recently been considered one of the most severe problems in deep learning technology, defense methods against them have been actively studied. Such defense methods are categorized into one of three architectures: (1) model retraining architecture; (2) input transformation architecture; and (3) adversarial example detection architecture. Defense methods using adversarial example detection architecture have been especially actively studied, because they do not make wrong decisions on legitimate input data while the others do. In this paper, we note that current defense methods using adversarial example detection architecture can only classify the input data as either legitimate or adversarial. That is, they can only detect adversarial examples and cannot classify the input data into multiple classes, i.e., legitimate input data and various types of adversarial examples. To classify the input data into multiple classes while increasing the accuracy of the clustering model, we propose an advanced defense method using adversarial example detection architecture, which extracts the key features from the input data and feeds the extracted features into a clustering model. From the experimental results on various application datasets, we show that the proposed method can detect adversarial examples while classifying their types. We also show that the accuracy of the proposed method outperforms that of recent defense methods using adversarial example detection architecture.
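
As one plausible reading of this pipeline (the paper's feature extractor and clustering model are not specified in this summary), a sketch using scikit-learn's KMeans with one cluster for legitimate data plus one per attack type:

```python
# Hedged sketch of multi-type detection by clustering extracted features.
from sklearn.cluster import KMeans

def fit_detector(features, n_attack_types):
    # features: (n_samples, n_dims) key features from the extractor.
    km = KMeans(n_clusters=1 + n_attack_types, n_init=10, random_state=0)
    km.fit(features)
    return km

def classify(km, feature):
    # Cluster id serves as the input type: legitimate or an attack class.
    return km.predict(feature.reshape(1, -1))[0]
```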

  • Research Article
  • Cited by 2
  • 10.32604/iasc.2022.021296
Restoration of Adversarial Examples Using Image Arithmetic Operations
  • Jan 1, 2022
  • Intelligent Automation & Soft Computing
  • Kazim Ali + 1 more

The current development of artificial intelligence is largely based on deep neural networks (DNNs). Especially in the computer vision field, DNNs now occur in everything from autonomous vehicles to safety control systems. The Convolutional Neural Network (CNN), based on DNNs, is mostly used in different computer vision applications, especially image classification and object detection. A CNN model takes photos as input and, after training, assigns them a suitable class by learning trainable parameters such as weights and biases. CNNs are inspired by the visual cortex of the human brain and sometimes perform even better than the human visual system. However, recent research shows that CNN models are highly vulnerable to adversarial examples. Adversarial examples are input images that are deliberately modified in ways imperceptible to humans but that a CNN model strongly misclassifies. This means that adversarial attacks or examples are a serious threat to deep learning models, especially CNNs in the computer vision field. The methods used to create adversarial examples are called adversarial attacks. We propose a simple method that restores adversarial examples created by different adversarial attacks and misclassified by a CNN model. Our reconstructed adversarial examples are again correctly classified by the model with high probability, restoring the CNN model’s predictions. Our method is based on image arithmetic operations and is simple, single-step, and of low computational complexity. It reconstructs all types of adversarial examples for correct classification, so we can say that the proposed method is universal or transferable. The datasets used for experimental evidence are MNIST, FASHION-MNIST, CIFAR10, and CALTECH-101. Finally, we present a comparative analysis with other state-of-the-art methods and show that our results are better.
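
The paper's exact operations are not reproduced in this summary; as a loose illustration of restoration by simple pixel arithmetic, coarse re-quantization can flatten small perturbations:

```python
# Speculative sketch only: integer divide/multiply/add per pixel, which
# collapses small adversarial perturbations into the same value bin.
import numpy as np

def restore(x_adv, k=16):
    # x_adv: float image in [0, 1]; k controls the bin width.
    x = (x_adv * 255.0).astype(np.uint8)
    return ((x // k) * k + k // 2).astype(np.float32) / 255.0
```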

  • Research Article
  • Cited by 69
  • 10.1109/tifs.2019.2925452
Selective Audio Adversarial Example in Evasion Attack on Speech Recognition System
  • Jul 12, 2019
  • IEEE Transactions on Information Forensics and Security
  • Hyun Kwon + 3 more

Deep neural networks (DNNs) are widely used for image recognition, speech recognition, and other pattern analysis tasks. Despite the success of DNNs, these systems can be exploited by what are termed adversarial examples. An adversarial example, in which a small distortion is added to the input data, can be designed to be misclassified by the DNN while remaining undetected by humans or other systems. Such adversarial examples have been studied mainly in the image domain. Recently, however, studies on adversarial examples have been expanding into the voice domain. For example, when an adversarial example is applied to enemy wiretapping devices (victim classifiers) in a military environment, the enemy device will misinterpret the intended message. In such scenarios, it is necessary that friendly wiretapping devices (protected classifiers) not be deceived. Therefore, the selective adversarial example concept can be useful in mixed situations, defined as situations in which there is both a classifier to be protected and a classifier to be attacked. In this paper, we propose a selective audio adversarial example with minimum distortion that will be misclassified as the target phrase by a victim classifier but correctly classified as the original phrase by a protected classifier. To generate such examples, a transformation is carried out to minimize the probability of incorrect classification by the protected classifier and the probability of correct classification by the victim classifier. We conducted experiments targeting the state-of-the-art DeepSpeech voice recognition model using the Mozilla Common Voice dataset and the TensorFlow library. They showed that the proposed method can generate a selective audio adversarial example with a 91.67% attack success rate and 85.67% protected-classifier accuracy.
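
A minimal sketch of the selective objective, with `victim_loss` and `protected_loss` as placeholder recognition losses (e.g., CTC against a phrase) and illustrative weights:

```python
# Hedged sketch: one distortion term plus two recognition terms, so the
# victim hears the target phrase while the protected model still hears
# the original. Interfaces and weights are assumptions.
import torch

def selective_loss(delta, audio, victim_loss, protected_loss,
                   target_phrase, original_phrase, a=1.0, b=1.0):
    x_adv = audio + delta
    dist = delta.pow(2).mean()                          # keep distortion low
    l_victim = victim_loss(x_adv, target_phrase)        # mislead the victim
    l_protect = protected_loss(x_adv, original_phrase)  # protect the ally
    return dist + a * l_victim + b * l_protect
```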

  • Research Article
  • Cited by 27
  • 10.1016/j.ins.2020.12.013
Target attack on biomedical image segmentation model based on multi-scale gradients
  • Dec 17, 2020
  • Information Sciences
  • Mingwen Shao + 3 more

  • Research Article
  • 10.55041/ijsrem27770
Synthesis of Vision and Language: Multifaceted Image Captioning Application
  • Dec 23, 2023
  • INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
  • Arpit Gupta + 2 more

The rapid advancement of image captioning has been a pivotal area of research, aiming to mimic human-like understanding of visual content. This paper presents an innovative approach that integrates attention mechanisms and object features into an image captioning model. Leveraging the Flickr8k dataset, this research explores the fusion of these components to enhance image comprehension and caption generation. Furthermore, the study showcases the implementation of this model in a user-friendly application built with FastAPI and ReactJS, offering text-to-speech translation in multiple languages. The findings underscore the efficacy of this approach in advancing image captioning technology. The paper also outlines the construction of an image caption generator, employing a Convolutional Neural Network (CNN) for image feature extraction and a Long Short-Term Memory (LSTM) network for natural language processing (NLP).
Keywords: Convolutional Neural Networks, Long Short-Term Memory, Attention Mechanism, Transformer Architecture, Vision Transformers, Transfer Learning, Multimodal Fusion, Deep Learning Models, Pre-Trained Models, Image Processing Techniques
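
A toy PyTorch sketch of the CNN-feature plus LSTM-decoder design this abstract describes; dimensions, vocabulary size, and the teacher-forcing interface are illustrative assumptions:

```python
# Hedged sketch of a CNN-encoder / LSTM-decoder captioner.
import torch
import torch.nn as nn

class Captioner(nn.Module):
    def __init__(self, feat_dim=2048, embed=256, hidden=512, vocab=8000):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden)  # image feature -> state
        self.embed = nn.Embedding(vocab, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, img_feats, captions):
        # img_feats: (B, feat_dim) from a pretrained CNN;
        # captions: (B, T) token ids, used with teacher forcing.
        h0 = self.init_h(img_feats).unsqueeze(0)   # (1, B, hidden)
        c0 = torch.zeros_like(h0)
        hs, _ = self.lstm(self.embed(captions), (h0, c0))
        return self.out(hs)                        # next-word logits
```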

  • Conference Article
  • Cited by 3
  • 10.1109/globecom46510.2021.9685442
Mal-LSGAN: An Effective Adversarial Malware Example Generation Model
  • Dec 1, 2021
  • Jianhua Wang + 5 more

Various machine learning (ML) models have been developed for malware detection, but their widespread application is challenged by adversarial attacks using adversarial malware examples. Generative Adversarial Networks (GANs) are one of the effective approaches to help build possible unknown attacks and expose the vulnerability of targeted systems. Existing GAN-based ML models suffer from unstable training and low-quality adversarial examples. In this paper, we propose a novel Mal-LSGAN model to tackle these weaknesses. By using a least-squares (LS) loss function and new activation function combinations, Mal-LSGAN achieves a higher Attack Success Rate (ASR) and a lower True Positive Rate (TPR) across 6 ML detectors, compared with the existing MalGAN and Imp-MalGAN. Against a Multi-Layer Perceptron (MLP) detector, Mal-LSGAN can even decrease the TPR from 97.81% on original examples to 2.92% on adversarial examples. The experimental results also demonstrate that Mal-LSGAN yields preferable transferability of adversarial malware examples.
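
For reference, the least-squares objective that gives Mal-LSGAN its name can be sketched as the generic LSGAN losses below (not the paper's full model, whose generator, detectors, and activation choices are not detailed here):

```python
# Generic LSGAN losses: the discriminator regresses real scores to 1 and
# fake scores to 0; the generator pushes fake scores toward 1.
import torch

def d_loss(d_real, d_fake):
    return 0.5 * ((d_real - 1).pow(2).mean() + d_fake.pow(2).mean())

def g_loss(d_fake):
    return 0.5 * (d_fake - 1).pow(2).mean()
```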

  • Research Article
  • Cited by 9
  • 10.1016/j.patrec.2020.04.034
Perturbation analysis of gradient-based adversarial attacks
  • May 1, 2020
  • Pattern Recognition Letters
  • Utku Ozbulak + 3 more

  • Book Chapter
  • 10.1007/978-3-031-16815-4_4
Towards Interpreting Vulnerability of Object Detection Models via Adversarial Distillation
  • Jan 1, 2022
  • Yaoyuan Zhang + 6 more

Recent works have shown that deep learning models are highly vulnerable to adversarial examples, limiting the application of deep learning in security-critical systems. This paper aims to interpret the vulnerability of deep learning models to adversarial examples. We propose adversarial distillation to illustrate that adversarial examples are generalizable data features. Deep learning models are vulnerable to adversarial examples because the models do not learn this data distribution. More specifically, we obtain adversarial features by introducing a generation and extraction mechanism. The generation mechanism generates adversarial examples that mislead the source model trained on the original clean samples. The extraction term removes the original features and selects valid and generalizable adversarial features. Valuable adversarial features guide the model to learn the data distribution of adversarial examples and realize the model’s generalization on the adversarial dataset. Extensive experimental evaluations have proved the excellent generalization performance of the adversarial distillation model. Compared with the normally trained model, the mAP increases by 2.17% on the respective test sets, while the mAP on the opponent’s test set is very low. The experimental results further prove that adversarial examples are also generalizable data features, which obey a different data distribution from the clean data. Understanding why deep learning models are not robust to adversarial samples helps attain interpretable and robust deep learning models. Robust models are essential for users to trust and interact with them, which can promote the application of deep learning in security-sensitive systems.
Keywords: Adversarial examples, Interpretability, Object detection, Deep learning
