Reinforcement learning when your life depends on it: A neuro-economic theory of learning.

Authors :
Jiang, Jiamu
Foyard, Emilie
van Rossum, Mark C. W.
Source :
PLoS Computational Biology. 10/28/2024, Vol. 20 Issue 10, p1-16. 16p.
Publication Year :
2024

Abstract

Synaptic plasticity enables animals to adapt to their environment, but memory formation can require a substantial amount of metabolic energy, potentially impairing survival. Hence, a neuro-economic dilemma arises as to whether learning is a profitable investment, and the brain must therefore judiciously regulate learning. Indeed, experiments have shown that during starvation, Drosophila suppress formation of energy-intensive aversive memories. Here we include energy considerations in a reinforcement learning framework. Simulated flies learned to avoid noxious stimuli through synaptic plasticity in either the energy-expensive long-term memory (LTM) pathway or the decaying anesthesia-resistant memory (ARM) pathway. The objective of the flies is to maximize their lifespan, which is calculated with a hazard function. We find that strategies that switch between the LTM and ARM pathways, based on energy reserve and reward prediction error, prolong lifespan. Our study highlights the significance of energy regulation of memory pathways and dopaminergic control for adaptive learning and survival. It might also benefit engineering applications of reinforcement learning under resource constraints.

Author summary: There is increasing evidence that biological learning, and in particular the creation of long-lasting forms of memory, requires substantial amounts of energy. It has been observed that, as a result, animals such as Drosophila might stop some forms of learning when they are low on energy. In this modelling paper we analyze this learning vs starvation trade-off using a hazard framework whose objective is to maximize the lifetime of the animal. We then explore the optimal algorithm to balance energy saving with learning. We find that it is best to restrict learning with expensive persistent memory to situations where the animal's energy reserve is high and there is also a large deviation between expected and actual reward. We speculate that there is evidence for similar energy-adaptive mechanisms in mammalian learning. The findings might also be relevant for human behavior and artificial systems with resource limitations such as limited battery life. [ABSTRACT FROM AUTHOR]
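
The gating rule described in the abstract (use the expensive LTM pathway only when the energy reserve is high and the reward prediction error is large, otherwise fall back to ARM) can be illustrated with a minimal Python sketch. This is not the authors' model: the function names, the threshold values, and the discrete-time lifespan formula are all assumptions chosen for illustration, and the abstract does not specify the form of the hazard function.

```python
import math


def choose_pathway(energy_reserve, reward_prediction_error,
                   energy_threshold=0.5, rpe_threshold=0.2):
    """Pick a memory pathway for one learning event.

    Hypothetical rule: both thresholds are illustrative values,
    not parameters taken from the paper.
    """
    energy_is_high = energy_reserve >= energy_threshold
    surprise_is_large = abs(reward_prediction_error) >= rpe_threshold
    # Expensive, persistent LTM only when both conditions hold;
    # otherwise use the cheaper, decaying ARM pathway.
    return "LTM" if (energy_is_high and surprise_is_large) else "ARM"


def expected_lifespan(hazards, dt=1.0):
    """Approximate expected lifespan from a sequence of hazard rates.

    Standard survival-analysis relation, discretized (an assumption about
    how a hazard function could be turned into a lifespan):
    the survival probability decays as exp(-hazard * dt) per step, and the
    expected lifespan is the accumulated survival probability over time.
    """
    survival = 1.0
    total = 0.0
    for hazard in hazards:
        total += survival * dt
        survival *= math.exp(-hazard * dt)
    return total
```

For example, `choose_pathway(0.9, 0.5)` selects the LTM pathway, while `choose_pathway(0.1, 0.5)` selects ARM because the energy reserve falls below the assumed threshold. In a fuller simulation, the hazard at each step would depend on the current energy reserve and on exposure to noxious stimuli, which is where the choice of pathway would feed back into lifespan.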

Details

Language :
English
ISSN :
1553734X
Volume :
20
Issue :
10
Database :
Academic Search Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
180522144
Full Text :
https://doi.org/10.1371/journal.pcbi.1012554