Adversarial perturbation denoising utilizing common characteristics in deep feature space.
- Source :
- Applied Intelligence; Jan 2024, Vol. 54, Issue 2, p1672-1690, 19p
- Publication Year :
- 2024
Abstract
- Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples (AEs). Denoising based on input pre-processing is one defense against adversarial attacks. However, it is hard to remove multiple kinds of adversarial perturbation, especially in the presence of evolving attacks. To address this challenge, we attempt to extract the commonality of adversarial perturbations. Because adversarial perturbations are imperceptible in the input space, we perform the extraction in the deep feature space, where the perturbations become more apparent. Using the extracted common characteristics, we craft common adversarial examples (CAEs) to train the denoiser. Furthermore, to avoid image distortion while removing as much of the adversarial perturbation as possible, we propose a hybrid loss function that guides the training process at both the pixel level and the deep-feature level. Our experiments show that our defense method can eliminate multiple adversarial perturbations, significantly enhancing adversarial robustness compared to previous state-of-the-art methods. Moreover, it is plug-and-play for various classification models, which demonstrates the generalizability of our defense method. [ABSTRACT FROM AUTHOR]
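The hybrid loss described in the abstract can be illustrated with a short sketch. The code below is a minimal, hypothetical PyTorch rendering, not the authors' implementation: it assumes a trainable denoiser network and a frozen feature_extractor taken from the target classifier, and the weighting term lambda_feat is an assumed hyperparameter.

    import torch
    import torch.nn.functional as F

    def hybrid_loss(denoiser, feature_extractor, x_adv, x_clean, lambda_feat=1.0):
        # Denoise the (common) adversarial example.
        x_denoised = denoiser(x_adv)

        # Pixel-level term: keep the denoised image close to the clean
        # image, which limits visible image distortion.
        pixel_loss = F.mse_loss(x_denoised, x_clean)

        # Feature-level term: align deep features of the denoised and
        # clean images, where adversarial perturbations are more apparent.
        with torch.no_grad():
            feat_clean = feature_extractor(x_clean)
        feat_denoised = feature_extractor(x_denoised)
        feature_loss = F.mse_loss(feat_denoised, feat_clean)

        # Weighted combination of the two terms (lambda_feat is assumed).
        return pixel_loss + lambda_feat * feature_loss

In this reading, the pixel term prevents distortion while the feature term targets perturbations where they are most apparent; the exact loss terms and weighting used in the paper may differ.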
- Subjects :
- Artificial neural networks
- Pixels
- Feature extraction
Details
- Language :
- English
- ISSN :
- 0924-669X
- Volume :
- 54
- Issue :
- 2
- Database :
- Complementary Index
- Journal :
- Applied Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 175530467
- Full Text :
- https://doi.org/10.1007/s10489-023-05253-5