Kolmogorov–Arnold Networks (KANs) are a novel class of neural network architectures, based on the Kolmogorov–Arnold representation theorem, that have demonstrated potential advantages in accuracy and interpretability over Multilayer Perceptrons (MLPs). This paper comprehensively evaluates the robustness of several KAN architectures (KAN, KAN-Mixer, KANConv_KAN, and KANConv_MLP) against adversarial attacks, an aspect that remains underexplored in current research. We compare these models with MLP-based architectures (MLP, MLP-Mixer, and ConvNet_MLP) across three traffic sign classification datasets: GTSRB, BTSD, and CTSD. The models are subjected to four adversarial attacks (FGSM, PGD, CW, and BIM) at varying perturbation levels and are trained under three strategies: standard training, adversarial training, and Randomized Smoothing. Our experimental results demonstrate that KAN-based models, particularly the KAN-Mixer, exhibit superior robustness to adversarial attacks compared to their MLP counterparts. Specifically, the KAN-Mixer consistently achieves lower Success Attack Rates (SARs) and Degrees of Change (DoCs) across most attack types and datasets while maintaining high accuracy on clean data. For instance, under FGSM attacks with ε = 0.01, the KAN-Mixer retains higher accuracy and incurs lower SARs than the MLP-Mixer. Adversarial training and Randomized Smoothing further enhance the robustness of KAN-based models, and t-SNE visualizations reveal more stable latent-space representations under adversarial perturbations. These findings underscore the potential of KAN architectures to improve neural network security and reliability in adversarial settings.
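For reference, FGSM is the single-step attack evaluated above: each input is perturbed by ε in the direction of the sign of the input gradient of the loss. A minimal PyTorch sketch follows; the function name, the [0, 1] pixel range, and the cross-entropy loss are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.01):
    """Illustrative FGSM sketch: one gradient-sign step of size epsilon.

    Assumes `images` are scaled to [0, 1] and `model` returns class logits;
    both are assumptions for this example, not the paper's exact setup.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)  # standard classification loss
    loss.backward()                                # gradient of loss w.r.t. inputs
    adv = images + epsilon * images.grad.sign()    # single ascent step on the loss
    return adv.clamp(0.0, 1.0).detach()            # keep pixels in the valid range
```

At ε = 0.01 (the setting cited above), the perturbation shifts each pixel by at most 1% of the intensity range, which is why clean-data accuracy is a meaningful baseline for comparison.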