Abstract
Backdoor attacks on deep neural networks (DNNs) are among the predominant threats to artificial intelligence. Existing methods for detecting backdoor attacks focus on distributions within DNNs, but they generalize poorly across DNN models. In this article, we propose a critical-path-based backdoor detector (CPBD), which detects backdoor attacks through DNN interpretability. CPBD is designed to efficiently discover the characteristics of backdoors that distinguish the critical paths in attacked DNNs. To deal with the intractably large number of neurons, we reduce the neurons to a small set of key nodes, and the preserved key nodes are integrated into a set of critical paths. A DNN model can thus be formulated as a combination of several critical paths. Backdoors are then detected by analyzing the critical paths corresponding to different classes. Combining these steps, the CPBD algorithm presents its results in a standard and systematic manner. In addition, CPBD can locate the neurons associated with malicious triggers; the combination of these neurons is termed the trigger propagation path. Extensive experiments on multiple DNNs and different trigger sizes demonstrate the efficiency of the proposed method.
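The idea of reducing a network to per-class critical paths and comparing them can be illustrated with a small sketch. This is not the CPBD algorithm itself: the abstract does not state how key nodes are selected or how paths are scored, so the top-k mean-activation criterion, the Jaccard overlap measure, and the function names `critical_path` and `path_overlap` are all assumptions made for illustration.

```python
import numpy as np


def critical_path(layer_activations, k):
    """Keep the k most active neurons per layer (assumed selection rule).

    layer_activations: list of arrays, one per layer, each shaped
    (n_samples, n_neurons), recorded for inputs of a single class.
    Returns a list with one frozenset of neuron indices per layer.
    """
    path = []
    for acts in layer_activations:
        mean_act = np.asarray(acts, dtype=float).mean(axis=0)
        top_k = np.argsort(mean_act)[-k:]  # indices of the k largest means
        path.append(frozenset(int(i) for i in top_k))
    return path


def path_overlap(path_a, path_b):
    """Mean per-layer Jaccard similarity between two paths (assumed measure)."""
    scores = [len(a & b) / len(a | b) for a, b in zip(path_a, path_b)]
    return float(np.mean(scores))


# Hypothetical activations for two classes in a one-layer, four-neuron model.
class_0 = [np.array([[0.0, 1.0, 5.0, 2.0], [0.0, 1.0, 5.0, 3.0]])]
class_1 = [np.array([[5.0, 4.0, 0.0, 0.5]])]

path_0 = critical_path(class_0, k=2)
path_1 = critical_path(class_1, k=2)
similarity = path_overlap(path_0, path_1)
```

Under these assumptions, each class is summarized by a short list of neuron sets, and classes can be compared pairwise through `path_overlap`. How such comparisons would be turned into a backdoor decision, and how a trigger propagation path would be extracted, is left open here because the abstract does not specify it.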
More From: IEEE Transactions on Neural Networks and Learning Systems