Emotion-sensitive deep dyna-Q learning for task-completion dialogue policy learning.
- Source :
- Neurocomputing. Oct 2021, Vol. 459, p122-130. 9p.
- Publication Year :
- 2021
Abstract
- In recent years, task-oriented dialogue systems have received extensive attention from academia and industry. Training a dialogue agent through reinforcement learning is often costly because it requires many interactions with real users. Although the Deep Dyna-Q (DDQ) framework uses simulation experience to alleviate the cost of direct reinforcement learning, it still suffers from challenges such as delayed rewards and policy degradation. This paper proposes an Emotion-Sensitive Deep Dyna-Q (ES-DDQ) model which: (1) presents an emotional world model that considers emotion-related cues to improve the ability of the traditional DDQ framework to model and simulate users, and (2) designs two kinds of emotion-related immediate rewards to mitigate the delayed reward problem. Experimental results show that our proposed approach effectively simulates users' behaviors and is superior to the state-of-the-art benchmarks. [ABSTRACT FROM AUTHOR]
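The idea of adding an emotion-related immediate reward to a delayed task reward can be illustrated with a minimal sketch. This is not the authors' implementation: the emotion labels, bonus values, weight, and function names below are all hypothetical, and a tabular Q-update stands in for the deep Q-network that a DDQ-style agent would use.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical emotion labels and bonus values; not taken from the paper.
EMOTION_BONUS: Dict[str, float] = {
    "positive": 1.0,
    "neutral": 0.0,
    "negative": -1.0,
}


def shaped_reward(task_reward: float, user_emotion: str, weight: float = 0.5) -> float:
    """Add an emotion-based immediate term to the (often delayed) task reward.

    Unknown emotion labels contribute nothing.
    """
    return task_reward + weight * EMOTION_BONUS.get(user_emotion, 0.0)


QTable = Dict[Tuple[Hashable, Hashable], float]


def q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    actions: List[Hashable],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """One tabular Q-learning step (a simplification of a deep Q-network update)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    q: QTable = {}
    # A mid-dialogue turn: no task reward yet, but the user sounds annoyed.
    r = shaped_reward(task_reward=0.0, user_emotion="negative")
    q_update(q, "ask_date", "repeat_question", r, "ask_date", ["repeat_question", "confirm"])
    print(q)
```

In this sketch the agent receives a learning signal on every turn rather than only at the end of the dialogue, which is the general motivation the abstract gives for emotion-related immediate rewards.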
- Subjects :
- *DEEP learning
- *REWARD (Psychology)
- *REINFORCEMENT learning
- *DIRECT costing
Details
- Language :
- English
- ISSN :
- 09252312
- Volume :
- 459
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 152347555
- Full Text :
- https://doi.org/10.1016/j.neucom.2021.06.075