Improving the transferability of adversarial examples with path tuning.
- Authors
Li, Tianyu; Li, Xiaoyu; Ke, Wuping; Tian, Xuwei; Zheng, Desheng; Lu, Chao
- Subjects
ARTIFICIAL neural networks; IRREGULAR sampling (Signal processing); COMPUTER vision; CYBERTERRORISM; ARTIFICIAL intelligence
- Abstract
Adversarial attacks pose a significant threat to real-world applications built on deep neural networks (DNNs), especially in security-critical settings. Research has shown that adversarial examples (AEs) generated on a surrogate model can also succeed against a target model, a property known as transferability. Feature-level transfer-based attacks improve the transferability of AEs by disrupting intermediate features: they target intermediate layers of the model and use feature importance metrics to identify which features to corrupt. However, current methods overfit these importance metrics to the surrogate model, so the metrics generalize poorly across models and deep features are insufficiently destroyed. This work demonstrates the trade-off between feature importance metrics and the generalization of feature corruption, and categorizes the ways in which feature destruction causes misclassification. It proposes a generative framework named PTNAA to guide the destruction of deep features across models, thereby improving the transferability of AEs. Specifically, the method introduces path methods into integrated gradients: it selects path functions using only a priori knowledge and approximates neuron attribution via nonuniform sampling. It then scores neurons based on the attribution results and performs a feature-level attack to remove inherent features of the image. Extensive experiments demonstrate the effectiveness of the proposed method. The code is available at https://github.com/lounwb/PTNAA. [ABSTRACT FROM AUTHOR]
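The core idea the abstract describes, attribution along a chosen path with nonuniformly spaced sample points instead of the uniform straight-line sampling of standard integrated gradients, can be sketched as follows. This is a minimal illustration, not PTNAA's actual implementation: the names `integrated_gradients_path`, `grad_fn`, and `path_fn` are hypothetical, and a toy quadratic model stands in for a DNN so the example is self-contained.

```python
import numpy as np

def integrated_gradients_path(grad_fn, path_fn, alphas):
    """Approximate path-based attribution with nonuniform sampling.

    grad_fn : gradient of the model output w.r.t. its input
    path_fn : maps alpha in [0, 1] to a point on the chosen path
              from the baseline (alpha=0) to the input (alpha=1)
    alphas  : sorted sample points in [0, 1], including 0 and 1;
              nonuniform spacing concentrates samples where the
              gradient is expected to vary most
    """
    attr = np.zeros_like(path_fn(1.0), dtype=float)
    for a0, a1 in zip(alphas[:-1], alphas[1:]):
        mid = 0.5 * (a0 + a1)            # midpoint rule per segment
        dx = path_fn(a1) - path_fn(a0)   # displacement along the path
        attr += grad_fn(path_fn(mid)) * dx
    return attr

# Toy model f(x) = sum(x_i^2), so grad f(x) = 2x.
grad_fn = lambda x: 2.0 * x

# Straight-line path from a zero baseline; nonuniform alphas,
# denser near the baseline (alpha = t^2 for uniform t).
x = np.array([1.0, 2.0])
path_fn = lambda a: a * x
alphas = np.linspace(0.0, 1.0, 11) ** 2

attr = integrated_gradients_path(grad_fn, path_fn, alphas)
# Completeness check: attributions sum to f(x) - f(baseline) = 5.0
print(attr, attr.sum())
```

For this quadratic model the segment integrand is linear in alpha, so the midpoint rule is exact and the attributions recover `x**2 = [1.0, 4.0]` regardless of the (nonuniform) spacing; for a real DNN the spacing of `alphas` controls the approximation error of the attribution.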
- Published
- 2024