Abstract

Despite their many successful applications, Deep Neural Networks (DNNs) are vulnerable to intentionally designed adversarial examples. Adversarial robustness describes the ability of a machine learning model, e.g., a neural network, to withstand such adversarial attacks. In coding theory, codebooks are designed to minimize the impact of errors that occur during transmission through a noisy channel. Motivated by the similarity between passing a codeword through a noisy channel and defending against adversarial attacks, Error-Correcting Output Codes (ECOCs) have been used to achieve state-of-the-art adversarial robustness. However, research on codebook design and on the association of codewords with classification labels (assignment) is still at an early stage, with considerable room for improvement. In this work, we present novel codebook design and assignment procedures, split into two stages due to the NP-hardness of the underlying joint problem. In the first stage, a rule-based heuristic designs the codebook; in the second stage, an optimization problem assigns the codewords to labels. Since this assignment problem is itself NP-hard, a greedy algorithm is proposed to provide a sub-optimal solution. We demonstrate the effectiveness of our framework on three benchmark datasets under different types of adversarial attacks. The experimental results show that our error-correcting output code framework can effectively improve the adversarial robustness of machine learning models, with up to a 10% increase in accuracy.
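To make the pipeline concrete, below is a minimal Python sketch of the three ingredients the abstract names: a rule-based codebook (here, rows of a Hadamard matrix, a classic ECOC construction with large pairwise Hamming distance), a greedy codeword-to-label assignment, and nearest-codeword decoding. The Hadamard rule, the confusion-weighted greedy objective, and all function names are illustrative assumptions for exposition, not the paper's actual design rules or algorithm.

```python
import numpy as np
from scipy.linalg import hadamard


def hadamard_codebook(n_classes: int, code_len: int) -> np.ndarray:
    """Rule-based codebook (assumed rule): rows of a Hadamard matrix give
    codewords with pairwise Hamming distance code_len / 2."""
    H = hadamard(code_len)          # entries in {-1, +1}; code_len must be a power of 2
    bits = (H > 0).astype(int)      # map to {0, 1}
    return bits[1 : n_classes + 1]  # skip the all-ones row; one codeword per class


def greedy_assignment(codebook: np.ndarray, confusion: np.ndarray) -> np.ndarray:
    """Illustrative greedy for the NP-hard assignment stage: process labels in
    order of total confusion, giving each label the free codeword that maximizes
    the confusion-weighted Hamming distance to codewords already assigned."""
    n = len(codebook)
    order = np.argsort(-confusion.sum(axis=1))  # most confusable labels first
    assignment = np.full(n, -1)
    free = list(range(n))
    for label in order:
        done = np.flatnonzero(assignment >= 0)
        best = max(
            free,
            key=lambda c: float(
                (confusion[label, done]
                 * (codebook[c] != codebook[assignment[done]]).sum(axis=1)).sum()
            ),
        )
        assignment[label] = best
        free.remove(best)
    return assignment  # assignment[label] = index of the codeword given to that label


def decode(outputs: np.ndarray, codewords: np.ndarray) -> int:
    """Standard ECOC decoding: predict the label whose codeword is nearest,
    in Hamming distance, to the thresholded network output."""
    bits = (outputs > 0.5).astype(int)
    return int((codewords != bits).sum(axis=1).argmin())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codebook = hadamard_codebook(10, 16)            # e.g., 10 classes, 16-bit codewords
    confusion = rng.random((10, 10))                # stand-in for a measured confusion matrix
    np.fill_diagonal(confusion, 0.0)
    assign = greedy_assignment(codebook, confusion)
    codewords = codebook[assign]                    # codewords[label] = assigned codeword
    noisy = codewords[3] ^ (rng.random(16) < 0.2)   # flip ~20% of bits, as a channel would
    print(decode(noisy.astype(float), codewords))   # should recover label 3 when few bits flip
```

With minimum pairwise Hamming distance 8, this 16-bit codebook can correct up to 3 flipped output bits per example, which is the error-correcting slack that the decoding step exploits against adversarial perturbations.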
