1. SS4CTR: a semi-supervised framework for enhancing click-through rate prediction in sparse and imbalanced data.
- Author
- Zhou, Junming; Chang, Chao; Li, Weisheng; Lin, Ronghua; Wu, Zhengyang; Tang, Yong
- Abstract
Click-through rate (CTR) prediction, which estimates the probability that a user clicks on a particular item, is a pivotal component of both online advertising and recommender systems. However, the problems of sparse and imbalanced data remain unresolved. To address these challenges, this paper proposes a semi-supervised framework called SS4CTR. Two features distinguish the proposed SS4CTR model. First, it employs an interpretable approach to select negative samples based on the global popularity of items, ensuring a balanced ratio of positive to negative samples in the input dataset. Second, by integrating both labeled and unlabeled data into the training process, the model tackles data sparsity and significantly improves the accuracy of CTR prediction. In addition, a confidence threshold mechanism for pseudo-labeling ensures that unlabeled data is used safely. To the best of our knowledge, this is the first study to address the key challenges posed by sparse and imbalanced data simultaneously in the context of CTR prediction. Extensive experiments on four real-world sparse datasets confirm the effectiveness and applicability of the SS4CTR model in scenarios characterized by sparse and imbalanced data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
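The two mechanisms named in the abstract, popularity-based negative sampling and confidence-thresholded pseudo-labeling, can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function names (`sample_negatives`, `pseudo_label`), the choice to sample unclicked items in proportion to their global click counts, the one-negative-per-positive ratio, and the symmetric threshold of 0.9 are all assumptions made for illustration.

```python
import random
from collections import Counter


def sample_negatives(clicks, num_per_positive=1, seed=0):
    """Draw negative (user, item, 0) samples for each user.

    `clicks` is a list of (user, item) positive pairs. ASSUMPTION: unclicked
    items are drawn with probability proportional to their global click
    count; the paper may use a different popularity rule.
    """
    rng = random.Random(seed)
    popularity = Counter(item for _, item in clicks)

    clicked_by_user = {}
    for user, item in clicks:
        clicked_by_user.setdefault(user, set()).add(item)

    negatives = []
    for user, clicked in clicked_by_user.items():
        candidates = [i for i in popularity if i not in clicked]
        if not candidates:
            continue
        weights = [popularity[i] for i in candidates]
        # Aim for a balanced ratio, capped by the number of available items.
        k = min(len(candidates), num_per_positive * len(clicked))
        chosen = set()
        while len(chosen) < k:  # weighted sampling without replacement
            chosen.add(rng.choices(candidates, weights=weights, k=1)[0])
        negatives.extend((user, item, 0) for item in sorted(chosen))
    return negatives


def pseudo_label(scores, threshold=0.9):
    """Keep only confident predictions on unlabeled samples.

    `scores` maps a sample id to a predicted click probability. ASSUMPTION:
    a single symmetric threshold; samples scoring between 1 - threshold and
    threshold are left unlabeled and excluded from training.
    """
    labeled = {}
    for sample_id, p in scores.items():
        if p >= threshold:
            labeled[sample_id] = 1
        elif p <= 1.0 - threshold:
            labeled[sample_id] = 0
    return labeled


clicks = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "c")]
negatives = sample_negatives(clicks)
confident = pseudo_label({"x": 0.95, "y": 0.5, "z": 0.03})
```

In this toy run, `confident` keeps only samples `x` and `z`; sample `y` falls inside the uncertain band and is dropped, which is the sense in which a threshold lets unlabeled data be used safely.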