
UCTT: universal and low-cost adversarial example generation for tendency classification.

Authors :
Zhang, Yunting
Ye, Lin
Tian, Zeshu
Chen, Zhe
Zhang, Hongli
Li, Baisong
Fang, Binxing
Source :
Neural Computing & Applications. Aug2024, Vol. 36 Issue 22, p13865-13894. 30p.
Publication Year :
2024

Abstract

An adversary can craft malicious samples that trigger erroneous judgments in deep learning models by introducing imperceptible perturbations into original benign texts. These malicious samples are referred to as adversarial texts. Exploring adversarial text generation methods not only improves our understanding of the robustness of mainstream deep neural networks against such attacks but also aids in developing appropriate defensive strategies. Nevertheless, mainstream research on textual adversarial attacks has focused mainly on attack effectiveness and overlooked the associated attack cost. For real-world attacks, the time cost, material cost, manpower cost, and various constraints are also crucial. In this paper, we propose Universal Chinese Text Tricker (UCTT), a low-cost adversarial text generation method based on the universal strategy in the black-box attack scenario, for tendency classification on Chinese texts. UCTT is both text-independent and model-independent, which markedly reduces its attack cost. Instead of crafting adversarial texts for a specific text, UCTT generates universal perturbations based on a universal word substitution list that is applicable to any data in tendency classification datasets. Given a perturbation rate, we can use the word list to craft adversarial texts by simple substitutions without accessing the target model. Within the framework of adversarial text generation based on word importance, UCTT uses count, arithmetic progression, linear normalization, and nonlinear normalization to calculate the scores of the important words in the dataset, then computes the candidate word frequencies, from which it constructs the universal word substitution list. Experimental results on real-world tendency classification datasets show that, compared with other black-box methods, UCTT exhibits an effective attack capability while significantly reducing the attack cost. Compared with the powerful baseline we designed, which exceeds the SOTA, UCTT improves the efficiency of adversarial text generation by up to a factor of 7 without accessing the target model. In addition to demonstrating excellent attack performance on mainstream models, UCTT is also capable of attacking the powerful ChatGPT in the physical world, which cannot be directly attacked by traditional adversarial text generation methods because the target model outputs hard labels. [ABSTRACT FROM AUTHOR]
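
The pipeline the abstract outlines (score important words across a dataset, count candidate replacements, build one universal substitution list, then substitute up to a perturbation rate) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract names the four scoring schemes but does not define them, so the formulas below are assumptions, as are all function names, the input format (per-text lists of (word, importance) pairs sorted by importance), and the source of the candidate replacements. English tokens stand in for segmented Chinese text.

```python
import math
from collections import Counter, defaultdict


def score_words(ranked_per_text, scheme="count"):
    """Aggregate per-text word importance into dataset-level scores.

    ranked_per_text: list of lists of (word, importance), each inner list
    sorted by importance in descending order. The four schemes are
    ASSUMED readings of the names used in the abstract.
    """
    scores = defaultdict(float)
    for ranked in ranked_per_text:
        n = len(ranked)
        if n == 0:
            continue
        values = [imp for _, imp in ranked]
        lo, hi = min(values), max(values)
        denom = sum(math.exp(v - hi) for v in values)  # stable softmax
        for rank, (word, imp) in enumerate(ranked):
            if scheme == "count":
                scores[word] += 1.0                    # one vote per text
            elif scheme == "arithmetic":
                scores[word] += (n - rank) / n         # n/n, (n-1)/n, ...
            elif scheme == "linear":
                scores[word] += (imp - lo) / (hi - lo) if hi > lo else 1.0
            elif scheme == "nonlinear":
                scores[word] += math.exp(imp - hi) / denom
            else:
                raise ValueError(f"unknown scheme: {scheme}")
    return dict(scores)


def build_substitution_list(scores, candidates, top_k):
    """Pair each high-scoring word with its most frequent candidate.

    candidates: dict mapping a word to the list of replacements observed
    for it (hypothetical input, e.g. collected from a per-text attack).
    """
    subs = []
    for word, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if candidates.get(word):
            best, _ = Counter(candidates[word]).most_common(1)[0]
            subs.append((word, best))
        if len(subs) == top_k:
            break
    return subs


def perturb(tokens, substitutions, rate):
    """Replace at most int(rate * len(tokens)) tokens; no model queries."""
    budget = int(rate * len(tokens))
    out = list(tokens)
    for word, replacement in substitutions:
        for i, tok in enumerate(tokens):   # match against the original text
            if budget == 0:
                return out
            if tok == word:
                out[i] = replacement
                budget -= 1
    return out
```

Once the list is built, `perturb` needs only the tokens and the perturbation rate, which mirrors the abstract's point that adversarial texts are crafted by simple substitution without accessing the target model.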

Details

Language :
English
ISSN :
0941-0643
Volume :
36
Issue :
22
Database :
Academic Search Index
Journal :
Neural Computing & Applications
Publication Type :
Academic Journal
Accession number :
178954619
Full Text :
https://doi.org/10.1007/s00521-024-09760-5