New Findings from University of Alberta Update Understanding of Hallucinations (Mitigating Value Hallucination In Dyna-style Planning Via Multistep Predecessor Models).

Source :: Pain & Central Nervous System Week; 7/12/2024, p646-646, 1p
Publication Year :: 2024
Abstract: Researchers from the University of Alberta report new findings on value hallucination, a failure mode of Dyna-style reinforcement learning agents, which update the value function with simulated experience generated by an environment model. They propose a new Dyna algorithm that avoids this failure by never updating the values of real states towards the values of simulated states, since such updates can produce misleading action values. The experimental results support this approach and suggest that predecessor models with multi-step updates are a promising direction for developing more robust Dyna algorithms. [Extracted from the article]
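
The idea summarized in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' algorithm: the function name `predecessor_planning_update`, the `predecessor_model` callable, the step sizes, and the greedy bootstrap are all assumptions made for illustration. Planning starts at a real state and walks backward through model-predicted predecessors, so only simulated predecessor state-action pairs are updated, and every target bootstraps from the real state's value.

```python
from collections import defaultdict


def predecessor_planning_update(q, real_state, predecessor_model, actions,
                                n_steps=3, alpha=0.5, gamma=0.9):
    """Hypothetical multistep backward planning update (illustrative only).

    q                 -- dict mapping (state, action) to an action value
    real_state        -- a state actually observed in the environment
    predecessor_model -- assumed callable: state -> (prev_state, action, reward)
                         or None when no predecessor is predicted
    """
    # Bootstrap once from the real state's greedy value; the real state's
    # own action values are never modified by this planning step.
    multistep_return = max(q[(real_state, a)] for a in actions)
    current = real_state
    for _ in range(n_steps):
        predicted = predecessor_model(current)
        if predicted is None:
            break
        prev_state, action, reward = predicted
        # Return from the predecessor: reward, then discounted return onward.
        multistep_return = reward + gamma * multistep_return
        # Move the simulated predecessor's value toward that return.
        q[(prev_state, action)] += alpha * (multistep_return - q[(prev_state, action)])
        current = prev_state
    return q
```

For example, on a four-state chain where the assumed model says state `s` is reached from `s - 1` by action `"right"` with zero reward, calling the function from real state 3 changes only the values of states 2 and 1; the value stored for the real state is left as it was.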
