
ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Model

Authors :
Lu, Ruan
HangCheng, Zhou
Meng, Ran
Jin, Zhao
JiaoYu, Qin
Feng, Wei
ChenZi, Wang
Publication Year :
2023

Abstract

The challenges faced by text classification with large label systems in natural language processing include multiple label hierarchies, uneven data distribution, and high noise. To address these problems, the ESimCSE unsupervised contrastive learning model and the UDA semi-supervised learning model are combined through joint training. ESimCSE efficiently learns text vector representations from unlabeled data to achieve better classification results, while UDA is trained on unlabeled data through semi-supervised consistency learning to improve the prediction performance and stability of the model and to further improve its generalization ability. In addition, the adversarial training techniques FGM and PGD are used during model training to improve the robustness and reliability of the model. Experimental results show accuracy improvements of 8% and 10% over the baseline on the public Reuters dataset and on an operational dataset, respectively, and a 15% improvement in manually validated accuracy on the operational dataset, indicating that the method is effective.

Comment: 14 pages, 4 figures, 4 tables
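The unsupervised contrastive objective the abstract refers to builds on the SimCSE-style InfoNCE loss, which treats two augmented encodings of the same sentence as a positive pair and all other in-batch sentences as negatives. A minimal numpy sketch of that loss is shown below; the function name, array shapes, and temperature value are illustrative assumptions, not details from the paper (ESimCSE additionally uses word-repetition augmentation and a momentum queue of negatives, which this sketch omits):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.05):
    """SimCSE-style InfoNCE contrastive loss (illustrative sketch).

    z1, z2: (batch, dim) arrays of sentence embeddings, where z1[i] and
    z2[i] are two views (e.g. differently augmented encodings) of the
    same sentence i.
    """
    # L2-normalize so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (batch, batch) similarity matrix

    # Row i's positive is column i; all other columns act as negatives.
    # Cross-entropy over each row, computed with a stable log-softmax.
    logits = sim - sim.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When the two views of each sentence are more similar to each other than to other sentences, the diagonal of the similarity matrix dominates and the loss approaches zero; mismatched pairs drive it up, which is what pushes the encoder toward discriminative sentence representations.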

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2304.13140
Document Type :
Working Paper