Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture.
- Source :
- IEEE Transactions on Neural Networks & Learning Systems; May 2022, Vol. 33 Issue 5, p2045-2056, 12p
- Publication Year :
- 2022
Abstract
- In this article, we consider a subclass of partially observable Markov decision process (POMDP) problems which we termed confounding POMDPs. In these types of POMDPs, temporal difference (TD)-based reinforcement learning (RL) algorithms struggle, as TD error cannot be easily derived from observations. We solve these types of problems using a new bio-inspired neural architecture that combines a modulated Hebbian network (MOHN) with deep Q-network (DQN), which we call modulated Hebbian plus Q-network architecture (MOHQA). The key idea is to use a Hebbian network with rarely correlated bio-inspired neural traces to bridge temporal delays between actions and rewards when confounding observations and sparse rewards result in inaccurate TD errors. In MOHQA, DQN learns low-level features and control, while the MOHN contributes to high-level decisions by associating rewards with past states and actions. Thus, the proposed architecture combines two modules with significantly different learning algorithms, a Hebbian associative network and a classical DQN pipeline, exploiting the advantages of both. Simulations on a set of POMDPs and on the Malmo environment show that the proposed algorithm improved DQN’s results and even outperformed control tests with advantage-actor critic (A2C), quantile regression DQN with long short-term memory (QRDQN + LSTM), Monte Carlo policy gradient (REINFORCE), and aggregated memory for reinforcement learning (AMRL) algorithms on most difficult POMDPs with confounding stimuli and sparse rewards. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 2162-237X
- Volume :
- 33
- Issue :
- 5
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Neural Networks & Learning Systems
- Publication Type :
- Periodical
- Accession number :
- 156718289
- Full Text :
- https://doi.org/10.1109/TNNLS.2021.3110281