Future Impact Decomposition in Request-level Recommendations
- Publication Year :
- 2024
Abstract
- In recommender systems, reinforcement learning solutions have shown promising results in optimizing the user-system interaction sequence for long-term performance. For practical reasons, the policy's actions are typically designed to recommend a list of items so that users' frequent and continuous browsing requests can be handled more efficiently. In this list-wise recommendation scenario, the user state is updated upon every request in the corresponding MDP formulation. However, this request-level formulation is essentially inconsistent with the user's item-level behavior. In this study, we demonstrate that an item-level optimization approach can better utilize item characteristics and improve the policy's performance even under the request-level MDP. We support this claim by comparing standard request-level methods with the proposed item-level actor-critic framework in both simulation and online experiments. Furthermore, we show that a reward-based future decomposition strategy can better express the item-wise future impact and improve long-term recommendation accuracy. To achieve a more thorough understanding of the decomposition strategy, we propose a model-based re-weighting framework with adversarial learning that further boosts performance, and we investigate its correlation with the reward-based strategy.
- Comment: 12 pages, 8 figures
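The abstract describes an item-level actor-critic update paired with a reward-based decomposition of each request's future impact across the items in the recommended list. The following is a minimal sketch, not the authors' implementation: the network sizes, the softmax weighting of the request-level future value by immediate item rewards, and the toy data are illustrative assumptions.

```python
# Minimal sketch of an item-level actor-critic step under a request-level MDP,
# with the request's future value spread across items by a reward-based weight.
# All dimensions, architectures, and the weighting scheme are assumptions.
import torch
import torch.nn as nn

STATE_DIM, ITEM_DIM, LIST_SIZE, GAMMA = 16, 8, 4, 0.9

actor = nn.Sequential(nn.Linear(STATE_DIM + ITEM_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
critic = nn.Sequential(nn.Linear(STATE_DIM + ITEM_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

def update(state, items, item_rewards, next_state, next_items):
    """One item-level update for a single user request.

    state:        (STATE_DIM,) request-level user state
    items:        (LIST_SIZE, ITEM_DIM) recommended item list
    item_rewards: (LIST_SIZE,) per-item immediate feedback (e.g. clicks)
    """
    sa = torch.cat([state.expand(LIST_SIZE, -1), items], dim=-1)
    next_sa = torch.cat([next_state.expand(LIST_SIZE, -1), next_items], dim=-1)

    # Reward-based future decomposition (illustrative): distribute the
    # request-level future value over items in proportion to their rewards.
    weights = torch.softmax(item_rewards, dim=0)
    future_value = critic(next_sa).squeeze(-1).sum().detach()
    item_targets = item_rewards + GAMMA * weights * future_value

    # Item-level critic: regress each item's value toward its decomposed target.
    values = critic(sa).squeeze(-1)
    advantage = (item_targets - values).detach()
    critic_loss = ((values - item_targets) ** 2).mean()

    # Item-level actor: raise the ranking score of items with positive advantage.
    scores = actor(sa).squeeze(-1)
    actor_loss = -(torch.log_softmax(scores, dim=0) * advantage).sum()

    opt.zero_grad()
    (actor_loss + critic_loss).backward()
    opt.step()
    return actor_loss.item(), critic_loss.item()

# Toy usage with random tensors standing in for logged interaction data.
s, ns = torch.randn(STATE_DIM), torch.randn(STATE_DIM)
its, nits = torch.randn(LIST_SIZE, ITEM_DIM), torch.randn(LIST_SIZE, ITEM_DIM)
r = torch.rand(LIST_SIZE)
print(update(s, its, r, ns, nits))
```

The key contrast with a purely request-level method is that the critic target and advantage are computed per item rather than once per request, which is what lets item characteristics shape the policy gradient; the paper's model-based re-weighting with adversarial learning would replace the simple reward-softmax weights above.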
- Subjects :
- Computer Science - Information Retrieval
- H.3.3
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2401.16108
- Document Type :
- Working Paper
- Full Text :
- https://doi.org/10.1145/3637528.3671506