1. Planning in a recurrent neural network that plays Sokoban
- Authors
Mohammad Taufeeque, Philip Quirke, Maximilian Li, Chris Cundy, Aaron David Tucker, Adam Gleave, and Adrià Garriga-Alonso
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
How a neural network (NN) generalizes to novel situations depends on whether it has learned to select actions heuristically or via a planning process. "An investigation of model-free planning" (Guez et al. 2019) found that a recurrent NN (RNN) trained to play Sokoban appears to plan, with extra computation steps improving the RNN's success rate. We replicate and expand on their behavioral analysis, finding the RNN learns to give itself extra computation steps in complex situations by "pacing" in cycles. Moreover, we train linear probes that predict the future actions taken by the network and find that intervening on the hidden state using these probes controls the agent's subsequent actions. Leveraging these insights, we perform model surgery, enabling the convolutional NN to generalize beyond its 10x10 architectural limit to arbitrarily sized inputs. The resulting model solves challenging, highly off-distribution levels. We open-source our model and code, and believe the neural network's small size (1.29M parameters) makes it an excellent model organism to deepen our understanding of learned planning.
- Comment
Mechanistic Interpretability workshop, ICML 2024
- Published
2024
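The probe-and-intervene technique the abstract describes can be sketched minimally: fit a linear map from hidden states to future actions, then steer a hidden state along the probe direction for a chosen action. Everything below (the synthetic hidden states, dimensions, and the `intervene` helper) is an illustrative stand-in, not the paper's actual model or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for RNN hidden states (n samples, d dims) and
# the future action each state led to (4 Sokoban moves: U/D/L/R).
n, d, n_actions = 500, 32, 4
true_w = rng.normal(size=(d, n_actions))
hidden = rng.normal(size=(n, d))
actions = (hidden @ true_w).argmax(axis=1)

# Train a linear probe: least-squares regression from hidden state
# to a one-hot encoding of the future action.
targets = np.eye(n_actions)[actions]
probe, *_ = np.linalg.lstsq(hidden, targets, rcond=None)

def predict(h):
    """Action the probe reads out of a hidden state."""
    return (h @ probe).argmax(axis=-1)

accuracy = (predict(hidden) == actions).mean()

def intervene(h, action, strength=10.0):
    """Push a hidden state along the probe direction for `action`."""
    direction = probe[:, action]
    return h + strength * direction / np.linalg.norm(direction)

# Steering a state toward action 2 raises that action's probe score.
h0 = hidden[0]
steered = intervene(h0, action=2)
```

In the paper this intervention is applied to the agent's actual recurrent state, and the agent's subsequent behavior changes accordingly; here the effect is only visible through the probe's own scores.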