Incorporating simulated spatial context information improves the effectiveness of contrastive learning models
- Author
- Lizhen Zhu, James Z. Wang, Wonseuk Lee, and Brad Wyble
- Subjects
contrastive learning, virtual environment, developmental psychology, deep learning, computer vision, intelligent agent, Computer software, QA76.75-76.765
- Abstract
Summary: Visual learning often occurs in a specific context, where an agent acquires skills through exploration and tracking of its location in a consistent environment. The historical spatial context of the agent provides a similarity signal for self-supervised contrastive learning. We present a unique approach, termed environmental spatial similarity (ESS), that complements existing contrastive learning methods. Using images from simulated, photorealistic environments as an experimental setting, we demonstrate that ESS outperforms traditional instance discrimination approaches. Moreover, sampling additional data from the same environment substantially improves accuracy and provides new augmentations. ESS allows remarkable proficiency in room classification and spatial prediction tasks, especially in unfamiliar environments. This learning paradigm has the potential to enable rapid visual learning in agents operating in new environments with unique visual characteristics. Potentially transformative applications span from robotics to space exploration. Our proof of concept demonstrates improved efficiency over methods that rely on extensive, disconnected datasets.
The bigger picture: Despite being trained on extensive datasets, current computer vision systems lag behind human children in learning about the visual world. One possible reason for this discrepancy is the fact that humans actively explore their environment as embodied agents, sampling data from a stable visual world with accompanying context. Bearing some resemblance to human childhood experience, contrastive learning is a machine-learning technique that allows learning of general features without having labeled data. This is done by grouping together similar things or objects and separating those that are dissimilar. Contrastive learning methods can be applied to multiple tasks, for example, to train visual learning agents. Improving these machine-learning strategies is important for the development of efficient intelligent agents, like robots or vehicles, with the ability to explore and learn from their surroundings.
- Published
- 2024