Abstract
Despite the widespread application of neural networks, concerns about their reliability and security are growing. This paper introduces a novel repair approach that integrates counterfactual causality with a multi-objective algorithm to mitigate violations of critical security properties, including backdoor attacks, safety violations, and unfairness. First, we abstract the neural network into a counterfactual structural causal model and use a causal trace to attribute to each hidden-layer neuron its contribution to the output prediction. Second, fault localization identifies “faulty” neurons from the counterfactual causality between attributes guided by positive and negative examples and each neuron’s contribution. Third, an ablation study determines the number of neurons to repair. Finally, the multi-objective algorithm NSGA-III optimizes the faulty neurons, adjusting their weight parameters to rectify erroneous behavior and eliminate vulnerabilities while maximizing the model’s accuracy. This method achieves notable Pareto optimization and effectively eliminates vulnerabilities in neural networks. In comparative experiments against the CARE (CAusality-based REpair) technique, our Counterfactual Causality-Based Repair (CCBR) method reduces backdoor attack success rates and security violations to less than 1%, while maintaining model accuracy and preserving correct behavior on normal inputs. In fairness repair tasks, CCBR improves fairness by 92.56%, a 19.56% increase over CARE. These results affirm the robustness and practicality of CCBR, establishing it as a valuable tool for enhancing neural network safety and fairness.
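The attribution and fault-localization steps can be illustrated with a minimal sketch. The abstract does not specify the paper's causal trace or its localization criterion, so everything here is an assumption made for illustration: a one-hidden-layer ReLU network with hand-picked weights, ablation (forcing one neuron's activation to zero) as the counterfactual intervention, and a hypothetical score defined as the gap between a neuron's contribution on negative (property-violating) and positive (benign) examples. The NSGA-III repair step is not shown.

```python
from typing import List, Optional, Sequence


def relu(v: float) -> float:
    return v if v > 0.0 else 0.0


def forward(x: Sequence[float],
            w_hidden: Sequence[Sequence[float]],
            w_out: Sequence[float],
            ablate: Optional[int] = None) -> float:
    """Scalar output of a one-hidden-layer ReLU network.

    If `ablate` is an index, that hidden neuron's activation is forced to 0,
    which plays the role of the counterfactual intervention in this sketch.
    """
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    if ablate is not None:
        hidden[ablate] = 0.0
    return sum(w * h for w, h in zip(w_out, hidden))


def contributions(samples: Sequence[Sequence[float]],
                  w_hidden: Sequence[Sequence[float]],
                  w_out: Sequence[float]) -> List[float]:
    """Mean absolute change in output when each hidden neuron is ablated."""
    scores = []
    for j in range(len(w_hidden)):
        total = 0.0
        for x in samples:
            factual = forward(x, w_hidden, w_out)
            counterfactual = forward(x, w_hidden, w_out, ablate=j)
            total += abs(factual - counterfactual)
        scores.append(total / len(samples))
    return scores


def rank_suspects(negative: Sequence[Sequence[float]],
                  positive: Sequence[Sequence[float]],
                  w_hidden: Sequence[Sequence[float]],
                  w_out: Sequence[float],
                  k: int) -> List[int]:
    """Indices of the k neurons that matter most on negative examples
    relative to positive ones (hypothetical localization score)."""
    neg = contributions(negative, w_hidden, w_out)
    pos = contributions(positive, w_hidden, w_out)
    gap = [n - p for n, p in zip(neg, pos)]
    return sorted(range(len(gap)), key=lambda j: gap[j], reverse=True)[:k]


# Toy data (assumed): neuron 1 fires only on the "trigger" feature x[1].
W_HIDDEN = [[1.0, 0.0], [0.0, 1.0]]
W_OUT = [1.0, 1.0]
NEGATIVE = [[0.0, 2.0], [0.0, 3.0]]   # inputs that violate the property
POSITIVE = [[1.0, 0.0], [2.0, 0.0]]   # benign inputs

suspects = rank_suspects(NEGATIVE, POSITIVE, W_HIDDEN, W_OUT, k=1)
print(suspects)
```

In this toy setting the neuron wired to the trigger feature is the one ranked first. In the approach described above, the weights of the selected neurons would then be the decision variables for the multi-objective search, with the value of `k` chosen by the ablation study.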