[1] LI Xuexiang, GAO Yafei, XIA Huili, et al. Backdoor Removal Method for Deep Neural Networks Based on Pruning and Backdoor Unlearning[J]. Journal of Zhengzhou University (Engineering Science), 2026, 47(XX): 1-8. [doi:10.13705/j.issn.1671-6833.2025.05.018]
Journal of Zhengzhou University (Engineering Science) [ISSN 1671-6833 / CN 41-1339/T]
Volume: 47
Issue: 2026, XX
Pages: 1-8
Publication date: 2026-09-10
- Title: Backdoor Removal Method for Deep Neural Networks Based on Pruning and Backdoor Unlearning
- Author(s): LI Xuexiang1; GAO Yafei1; XIA Huili2; WANG Chao1; LIU Minglin1
- Affiliation(s): 1. School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, China; 2. Henan Multimodal Perception and Intelligent Interaction Technology Engineering Research Center, Zhengzhou 451191, China
- Keywords: deep neural networks; backdoor attack; backdoor defense; pre-activation distribution; adversarial backdoor unlearning
- CLC: TP309; TP181
- DOI: 10.13705/j.issn.1671-6833.2025.05.018
- Abstract: Backdoor attacks pose a serious threat to the security of deep neural networks. Most existing backdoor defense methods rely on a portion of the original training data to remove backdoors from models. However, in real-world scenarios where access to such data is limited, these methods perform poorly at eliminating backdoors and often significantly degrade the model's original accuracy. To address these issues, this paper proposes a data-free backdoor removal method based on pruning and backdoor unlearning (DBR-PU). Specifically, the proposed method first analyzes differences in the pre-activation distributions of the model's neurons on a synthetic dataset to identify suspicious neurons. It then reduces the impact of the backdoor by pruning these suspicious neurons. Finally, an adversarial backdoor unlearning strategy is employed to further eliminate the model's internal response to any residual backdoor information. Extensive experiments on the CIFAR10 and GTSRB datasets against six mainstream backdoor attack methods demonstrate that, under data access constraints, the proposed method achieves a minimal accuracy gap relative to the best baseline defense methods and performs best in reducing attack success rates, outperforming the best baseline defense method by 2.37% and 1.3%, respectively.
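As a rough illustration of the pruning step described in the abstract, the sketch below collects per-neuron pre-activation statistics on a synthetic dataset, flags neurons whose statistics deviate from the rest of the layer, and zeroes out their weights. This is a minimal sketch, not the paper's implementation: the PyTorch `nn.Linear` layer, the Gaussian synthetic inputs, and the z-score outlier criterion are assumptions standing in for DBR-PU's actual pre-activation distribution analysis and pruning rule.

```python
import torch
import torch.nn as nn


def collect_preactivations(layer: nn.Linear, inputs: torch.Tensor) -> torch.Tensor:
    """Return pre-activation values (before the nonlinearity) for every neuron."""
    with torch.no_grad():
        return inputs @ layer.weight.T + layer.bias  # shape: (num_samples, num_neurons)


def find_suspicious_neurons(preacts: torch.Tensor, z_thresh: float = 3.0) -> torch.Tensor:
    """Flag neurons whose mean pre-activation is an outlier within the layer.

    A simple z-score over per-neuron means is used here as a placeholder
    criterion; the paper's distribution-difference measure may differ.
    """
    per_neuron_mean = preacts.mean(dim=0)
    z = (per_neuron_mean - per_neuron_mean.mean()) / (per_neuron_mean.std() + 1e-8)
    return torch.nonzero(z.abs() > z_thresh).flatten()


def prune_neurons(layer: nn.Linear, neuron_idx: torch.Tensor) -> None:
    """Zero the incoming weights and bias of flagged neurons, disabling them."""
    with torch.no_grad():
        layer.weight[neuron_idx] = 0.0
        layer.bias[neuron_idx] = 0.0


if __name__ == "__main__":
    torch.manual_seed(0)
    layer = nn.Linear(64, 128)
    synthetic_inputs = torch.randn(512, 64)  # stand-in for the synthetic dataset
    preacts = collect_preactivations(layer, synthetic_inputs)
    suspicious = find_suspicious_neurons(preacts)
    prune_neurons(layer, suspicious)
    print(f"Pruned {suspicious.numel()} suspicious neurons")
```

In the full method, pruning would be followed by the adversarial backdoor unlearning stage mentioned in the abstract, which is not shown here.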