1. Type-I Generative Adversarial Attack
- Authors
He, Shenghong; Wang, Ruxin; Liu, Tongliang; Yi, Chao; Jin, Xin; Liu, Renyang; Zhou, Wei
- Abstract
Deep neural networks are vulnerable to adversarial attacks, either through examples with imperceptible perturbations that produce incorrect predictions, or through examples with noticeable transformations that are still predicted as the original label. The latter case is known as the Type I attack, which has received limited attention in the literature. We argue that this vulnerability stems from the ambiguous distributions among different classes in the model's resulting feature space; that is, examples with different appearances may present similar features. Inspired by this, we propose a novel Type I attack method called the generative adversarial attack (GAA). Specifically, GAA exploits the distribution mapping from a source domain of multiple classes to a target domain of a single class using generative adversarial networks. A novel loss and a U-net architecture with latent modification are designed to ensure a stable transformation between the two domains. In this way, the generated adversarial examples appear similar to examples of the target domain, yet still receive the original prediction from the model being attacked. Extensive experiments on multiple benchmarks demonstrate that the proposed method generates adversarial images that are more visually similar to the target images than those of competing methods, achieving state-of-the-art performance.
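To make the mechanism concrete, below is a minimal PyTorch-style sketch of a Type-I generator objective in the spirit the abstract describes: a GAN term pushes the generated image toward the target-class domain while a label-preservation term keeps the attacked model's original prediction. `ToyGenerator`, `ToyCritic`, `type1_generator_loss`, and the stand-in classifier are illustrative placeholders, not the paper's actual U-net, loss, or training procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal stand-ins (NOT the paper's architectures): a toy convolutional
# generator in place of the U-net with latent modification, and a small
# critic for the target-class image domain.
class ToyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class ToyCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(), nn.LazyLinear(1),
        )

    def forward(self, x):
        return self.net(x)

def type1_generator_loss(G, D, classifier, x_src, y_src):
    """One generator update for a Type-I style attack: the output should
    (a) fool the critic into treating it as a target-domain image, while
    (b) the frozen classifier still predicts the original source label."""
    x_adv = G(x_src)
    # (a) GAN term: push generated images toward the target-class domain.
    adv_loss = F.binary_cross_entropy_with_logits(
        D(x_adv), torch.ones(x_src.size(0), 1))
    # (b) label-preservation term: keep the attacked model's prediction.
    cls_loss = F.cross_entropy(classifier(x_adv), y_src)
    return adv_loss + cls_loss  # relative weighting is a free hyperparameter

# Usage sketch with random data; the classifier stands in for the model
# under attack and would be pretrained and frozen in practice.
G, D = ToyGenerator(), ToyCritic()
classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
loss = type1_generator_loss(G, D, classifier, x, y)
loss.backward()
```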
- Published
2023