Enhancement of Hippocampal Spatial Decoding Using a Dynamic Q-Learning Method With a Relative Reward Using Theta Phase Precession
- Author
- Boyi Qu, Bo Wei Chen, Chen Yang Hsu, Shih Hung Yang, Yu Chun Lo, Sheng Wei Lee, Hsin Yi Lai, Han Lin Wang, Chao Hung Kuo, You Yin Chen, Jung Chen Chen, Sheng Huang Lin, Xiao Yu, Ching Fu Wang, Yun Ting Kuo, and Han Chi Pan
- Subjects
- Mean squared error, Computer Networks and Communications, Q-learning, 03 medical and health sciences, 0302 clinical medicine, Reward, Interneurons, Convergence (routing), Animals, Reinforcement learning, Theta Rhythm, CA1 Region, Hippocampal, 030304 developmental biology, Mathematics, 0303 health sciences, Behavior, Animal, Artificial neural network, Electroencephalography, General Medicine, Function (mathematics), Models, Theoretical, Rats, Place Cells, Rate of convergence, Goals, Algorithm, Algorithms, 030217 neurology & neurosurgery, Decoding methods, Spatial Navigation
- Abstract
- Hippocampal place cells and interneurons in mammals have stable place fields and theta phase precession profiles that encode spatial environmental information. Hippocampal CA1 neurons can represent an animal's location as well as prospective information about the goal location. Reinforcement learning (RL) algorithms such as Q-learning have been used to build navigation models. However, traditional Q-learning (tQ-learning) delivers the reward only once the animal arrives at the goal location, leading to unsatisfactory location accuracy and slow convergence. We therefore propose a revised version of the Q-learning algorithm, dynamical Q-learning (dQ-learning), which assigns the reward function adaptively to improve decoding performance. The firing rate is the input to the neural network of dQ-learning and is used to predict the movement direction, whereas the phase precession is the input to the reward function that updates the weights of dQ-learning. Trajectory predictions by tQ- and dQ-learning were compared via the root mean squared error (RMSE) between the actual and predicted rat trajectories. dQ-learning achieved significantly higher prediction accuracy and a faster convergence rate than tQ-learning for all cell types. Moreover, combining place cells and interneurons with theta phase precession further improved the convergence rate and prediction accuracy. The proposed dQ-learning algorithm is a fast and more accurate method for trajectory reconstruction and prediction.
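- Note. The abstract does not specify the network architecture, the exact reward-shaping rule, or the RMSE conventions, so the following Python sketch is only illustrative: it contrasts a fixed goal-only reward (the tQ-learning setting) with a phase-driven adaptive reward (the dQ-learning idea). All names here (`t_reward`, `d_reward`, `q_update`, `trajectory_rmse`) and the cosine shaping term are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 64, 4   # discretized arena positions, movement directions
ALPHA, GAMMA = 0.1, 0.9       # learning rate, discount factor
GOAL_STATE = 63

Q = np.zeros((N_STATES, N_ACTIONS))

def t_reward(state):
    """Traditional (tQ) reward: fixed, delivered only at the goal location."""
    return 1.0 if state == GOAL_STATE else 0.0

def d_reward(state, phase_offset):
    """Dynamic (dQ-style) reward: adds a per-step shaping term driven by the
    theta phase precession signal, so learning is guided before the goal is
    reached. The 0.1 * cos(...) scaling is a hypothetical choice."""
    goal_bonus = 1.0 if state == GOAL_STATE else 0.0
    return goal_bonus + 0.1 * np.cos(phase_offset)

def q_update(Q, s, a, s_next, r):
    """Standard temporal-difference Q-learning update."""
    td_target = r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (td_target - Q[s, a])

def trajectory_rmse(actual, predicted):
    """RMSE between actual and predicted 2-D trajectories of shape (T, 2)."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.sqrt(np.mean(np.sum((actual - predicted) ** 2, axis=1)))

# One illustrative update step with synthetic data:
s, a, s_next = 10, 2, 11
phase_offset = rng.uniform(-np.pi, np.pi)   # stand-in for a measured theta phase
q_update(Q, s, a, s_next, d_reward(s_next, phase_offset))
```

- In this toy setting, `t_reward` leaves the temporal-difference error at zero on every step that does not reach the goal, while `d_reward` supplies an informative error signal each step; that difference is the intuition behind the faster convergence reported for dQ-learning.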
- Published
- 2020