Uncovering Distortion Differences: A Study of Adversarial Attacks and Machine Discriminability
- Author
Xiawei Wang, Yao Li, Cho-Jui Hsieh, and Thomas C. M. Lee
- Subjects
Decision-based attacks, deep neural networks, gradient-based attacks, image classification, score-based attacks, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Deep neural networks have performed remarkably well in many areas, including image-related classification tasks. However, various studies have shown that they are vulnerable to adversarial examples: images carefully crafted to fool well-trained deep neural networks by introducing imperceptible perturbations to the original images. To better understand the inherent characteristics of adversarial attacks, this paper studies the features of three common attack families: gradient-based, score-based, and decision-based. The primary objective is to recognize distinct types of adversarial examples, since identifying the type of information possessed by the attacker can aid in developing effective defense strategies. This paper demonstrates that adversarial images from different attack families can be successfully identified with a simple model. To further investigate the reason behind this observation, the paper conducts carefully designed experiments to study the distortion patterns of different attacks. Experimental results on CIFAR10 and Tiny ImageNet validate the differences in distortion patterns between attack types under both the $L_{2}$ and $L_{\infty}$ norms.
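The pipeline the abstract describes can be illustrated with a short sketch: given pairs of clean and adversarial images, the perturbation is the pixel-wise difference $\delta = x_{\text{adv}} - x$, and a simple model is trained to predict the attack family from it. The snippet below uses synthetic stand-in perturbations; the Gaussian generator, the per-family class means, and the logistic-regression classifier are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: classifying adversarial perturbations by attack family.
# Synthetic stand-in data; the paper's actual experiments use CIFAR10
# and Tiny ImageNet with real attacks (illustrative assumption).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class, dim = 300, 32 * 32 * 3  # CIFAR10-sized flattened images

# Hypothetical stand-ins for perturbations delta = x_adv - x from three
# attack families (gradient-, score-, and decision-based); real deltas
# would come from running the attacks against a trained network.
deltas, labels = [], []
for family in range(3):
    base = rng.normal(loc=0.02 * family, scale=0.01, size=(n_per_class, dim))
    deltas.append(base)
    labels.append(np.full(n_per_class, family))
X = np.vstack(deltas)
y = np.concatenate(labels)

# Per-example L2 and Linf distortion, the two norms studied in the paper.
l2 = np.linalg.norm(X, ord=2, axis=1)
linf = np.linalg.norm(X, ord=np.inf, axis=1)
print(f"mean L2 = {l2.mean():.3f}, mean Linf = {linf.mean():.4f}")

# "Simple model": a linear classifier on the raw flattened perturbations.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"attack-family accuracy: {clf.score(X_te, y_te):.3f}")
```

In the real setting, the stand-in perturbations would be replaced by differences between attacked and clean test images, and any classifier that separates the families well would support the paper's claim that the three attack types leave measurably different distortion patterns.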
- Published
2024