Abstract

An adversarial example is a specially-crafted input with subtle, intentional perturbations that cause a machine learning model to misclassify it. A plethora of papers have proposed using filters to defend effectively against adversarial example attacks. In this paper, however, we demonstrate that filter-based defenses may not be reliable. We develop AEDescaptor, a scheme to escape filter-based defenses. AEDescaptor uses a specially-crafted policy gradient reinforcement learning algorithm to generate adversarial examples even when filters interrupt the backpropagation channel on which traditional adversarial example attack algorithms rely. Furthermore, we design a customized algorithm that reduces the action space of the policy gradient reinforcement learning to accelerate AEDescaptor training while still ensuring that AEDescaptor generates successful adversarial examples. Extensive experiments demonstrate that AEDescaptor-generated adversarial examples achieve high success rates and good transferability in escaping filter-based defenses.
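The key idea in the abstract is that the attacker never needs gradients from the defended model: a policy proposes discrete perturbation actions, the filtered classifier is queried as a black box, and the policy is trained from the resulting reward via policy gradients. The paper's actual algorithm is not reproduced here; the following is a minimal REINFORCE-style sketch of that idea, with an illustrative reduced action space (perturb one of k candidate pixels by plus or minus eps). All names and hyperparameters (target_model, query_target, k, eps) are hypothetical, not taken from the paper.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Stand-in for the defended classifier. In the attack setting it can
    # only be queried; its gradients are unavailable because the filter
    # interrupts the backpropagation channel.
    target_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

    def query_target(image, true_label):
        # Black-box query: reward grows as the true-class confidence drops.
        with torch.no_grad():
            probs = target_model(image.unsqueeze(0)).softmax(dim=1).squeeze(0)
        return 1.0 - probs[true_label].item()

    # Reduced action space (illustrative): perturb one of k candidate
    # pixels by +eps or -eps, i.e. 2k discrete actions in total.
    k, eps = 64, 0.1
    pixels = torch.randint(0, 28 * 28, (k,))
    policy_logits = torch.zeros(2 * k, requires_grad=True)
    optimizer = torch.optim.Adam([policy_logits], lr=0.05)

    image = torch.rand(1, 28, 28)   # the example being perturbed
    true_label = 3

    for step in range(200):
        dist = torch.distributions.Categorical(logits=policy_logits)
        action = dist.sample()
        pixel = pixels[action % k]
        sign = 1.0 if action < k else -1.0

        perturbed = image.clone().view(-1)
        perturbed[pixel] = (perturbed[pixel] + sign * eps).clamp(0.0, 1.0)
        reward = query_target(perturbed.view(1, 28, 28), true_label)

        # REINFORCE update: gradients flow only through the policy's
        # log-probability of the sampled action, never through the
        # defended model.
        loss = -dist.log_prob(action) * reward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Because the reward comes from a no-grad query, the filter can break backpropagation through the classifier without affecting the policy update, and shrinking the action space (here, 2k actions instead of one per pixel-value combination) is what keeps such query-based training tractable.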
