Reinforcement Learning for Production‐Based Cognitive Models
- Author
- Brasoveanu, Adrian; Dotlačil, Jakub
- Subjects
Linguistics and Language; Experimental and Cognitive Psychology; Cognitive Neuroscience; Human-Computer Interaction; Artificial Intelligence; Computer science; Cognition; Reinforcement learning; Q-learning; Reinforcement, Psychology; Production-based cognitive models; ACT-R; Soar; Sequential decision processes; Learnability of sequential decision processes; Lexical decision task; Hand coding; Humans; Algorithms
- Abstract
Production-based cognitive models, such as Adaptive Control of Thought-Rational (ACT-R) or Soar agents, have been a popular tool in cognitive science to model sequential decision processes. While the models have been useful in articulating assumptions and predictions of various theories, they unfortunately require a significant amount of hand coding, both with respect to what building blocks cognitive processes should consist of and with respect to how these building blocks are selected and ordered in a sequential decision process. Hand coding of large, realistic models poses a challenge for modelers, and also makes it unclear whether the models can be learned and are thus cognitively plausible. The learnability issue is probably most starkly present in cognitive models of linguistic skills, since linguistic skills involve richly structured representations and highly complex rules. We investigate how reinforcement learning (RL) methods can be used to solve the production selection and production ordering problem in ACT-R. We focus on four algorithms from the Q-learning family, tabular Q-learning and three versions of deep Q-networks (DQNs), as well as the ACT-R utility learning algorithm, which provides a baseline for the Q-learning algorithms. We compare the performance of these five algorithms in a range of lexical decision (LD) tasks framed as sequential decision problems. We observe that, unlike the ACT-R baseline, the Q-learning agents learn even the more complex LD tasks fairly well. However, tabular Q-learning and DQNs show a trade-off between speed of learning, applicability to more complex tasks, and how noisy the learned rules are. This indicates that the ACT-R subsymbolic system for procedural memory could be improved by incorporating more insights from RL approaches, particularly the function-approximation-based ones, which learn and generalize effectively in complex, more realistic tasks.
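For orientation, the Q-learning family referenced in the abstract maintains estimates of state–action values Q(s, a) updated by the temporal-difference rule Q(s, a) ← Q(s, a) + α[r + γ max_a' Q(s', a') − Q(s, a)]; the tabular variant stores these values in a lookup table, while the DQN variants approximate them with a neural network. The sketch below is a minimal, generic illustration of the tabular update, not the authors' implementation: the state names, action names, and hyperparameters are placeholder assumptions, and in the paper the actions would correspond to ACT-R productions firing in a lexical decision task framed as a sequential decision problem.

```python
# Minimal, generic tabular Q-learning sketch (illustrative only, not the paper's code).
# States, actions, and hyperparameters below are hypothetical placeholders.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1          # learning rate, discount, exploration rate
ACTIONS = ["retrieve", "respond_yes", "respond_no"]  # stand-ins for production rules

Q = defaultdict(float)  # Q[(state, action)] -> estimated value, defaults to 0.0

def choose_action(state):
    """Epsilon-greedy selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, done):
    """One temporal-difference backup: Q(s,a) += alpha * (target - Q(s,a))."""
    target = reward if done else reward + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# Example of a single backup for an illustrative state transition:
q_update(state="word_on_screen", action="retrieve", reward=0.0,
         next_state="retrieval_done", done=False)
```

By contrast, the ACT-R utility learning baseline mentioned in the abstract updates per-production utilities with a simple delta rule toward received rewards, without the max over next-state values that lets Q-learning propagate value back through a sequence of decisions.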
- Published
- 2021