
Open-Vocabulary Predictive World Models from Sensor Observations.

Authors :
Karlsson R
Asfandiyarov R
Carballo A
Fujii K
Ohtani K
Takeda K
Source :
Sensors (Basel, Switzerland) [Sensors (Basel)] 2024 Jul 21; Vol. 24 (14). Date of Electronic Publication: 2024 Jul 21.
Publication Year :
2024

Abstract

Cognitive scientists believe that adaptable intelligent agents like humans perform spatial reasoning tasks through learned causal mental simulation. The problem of learning these simulations is called predictive world modeling. We present the first framework for learning an open-vocabulary predictive world model (OV-PWM) from sensor observations. The model is implemented as a hierarchical variational autoencoder (HVAE) capable of predicting diverse and accurate fully observed environments from accumulated partial observations. We show that the OV-PWM can model high-dimensional embedding maps of latent compositional embeddings, which represent sets of overlapping semantics that can be inferred by sufficient similarity inference. The OV-PWM simplifies the prior two-stage closed-set PWM approach into a single-stage end-to-end learning method. CARLA simulator experiments show that the OV-PWM can learn compact latent representations and generate diverse and accurate worlds with fine details like road markings, achieving 69 mIoU over six query semantics on an urban evaluation sequence. We propose the OV-PWM as a versatile continual learning paradigm for providing spatio-semantic memory and learned internal simulation capabilities to future general-purpose mobile robots.
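
The abstract mentions two ideas that a toy example may help make concrete: querying an embedding map by "sufficient similarity" and scoring the result as mIoU over a set of query semantics. The sketch below is illustrative only and is not the authors' implementation or evaluation code. It assumes cosine similarity as the similarity measure, a fixed threshold of 0.7, and a tiny hand-made 2x2 map with 2-D embeddings; the function names (cosine, query_mask, iou, mean_iou) and all values are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def query_mask(embedding_map, query, threshold):
    """Mark each cell whose embedding is sufficiently similar to the query.

    A cell may pass the threshold for several queries, which is how one
    compositional embedding can carry overlapping semantics.
    """
    return [[cosine(cell, query) >= threshold for cell in row]
            for row in embedding_map]

def iou(pred, truth):
    """Intersection over union of two boolean grids of equal shape."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0

def mean_iou(embedding_map, queries, truths, threshold=0.7):
    """Average the per-query IoU over all query semantics."""
    scores = [iou(query_mask(embedding_map, q, threshold), truths[name])
              for name, q in queries.items()]
    return sum(scores) / len(scores)

# Hypothetical 2x2 map: the top-right cell mixes "road" and "marking".
emb_map = [[[1.0, 0.0], [1.0, 1.0]],
           [[0.0, 1.0], [1.0, 0.0]]]
queries = {"road": [1.0, 0.0], "marking": [0.0, 1.0]}
truths = {"road": [[True, True], [False, True]],
          "marking": [[False, True], [True, True]]}

road_mask = query_mask(emb_map, queries["road"], 0.7)
marking_mask = query_mask(emb_map, queries["marking"], 0.7)
score = mean_iou(emb_map, queries, truths)
```

In this toy map the top-right cell is returned for both the "road" and the "marking" query, which a closed-set map with one label per cell could not express. The threshold is a free parameter in this sketch; how the paper selects it is not stated in the abstract.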

Details

Language :
English
ISSN :
1424-8220
Volume :
24
Issue :
14
Database :
MEDLINE
Journal :
Sensors (Basel, Switzerland)
Publication Type :
Academic Journal
Accession number :
39066133
Full Text :
https://doi.org/10.3390/s24144735