1. On the Road With 16 Neurons: Towards Interpretable and Manipulable Latent Representations for Visual Predictions in Driving Scenarios
- Author
Alice Plebe and Mauro Da Lio
- Subjects
Autonomous driving, convergence-divergence zones, deep learning, predictive brain, variational autoencoder, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This paper proposes a strategy for visual perception in the context of autonomous driving. Humans, when not distracted or drunk, are still the best drivers you can currently find. For this reason, we take inspiration from two theoretical ideas about the human mind and its neural organization. The first idea concerns how the brain uses structures of neuron ensembles that expand and compress information to extract abstract concepts from visual experience and code them into compact representations. The second idea suggests that these neural perceptual representations are not neutral but functional to predicting the future state of affairs in the environment. Similarly, the prediction mechanism is not neutral but oriented to the planning of future action. We identify within the deep learning framework two artificial counterparts of the aforementioned neurocognitive theories. We find a correspondence between the first theoretical idea and the architecture of convolutional autoencoders, while we translate the second theory into a training procedure that learns compact representations which are not neutral but oriented to driving tasks, from two distinct perspectives. From a static perspective, we force separate groups of neural units in the compact representations to represent specific concepts crucial to the driving task distinctly. From a dynamic perspective, we bias the compact representations to predict how the current road scenario will change in the future. We successfully learn compact representations that use as few as 16 neural units for each of the two basic driving concepts we consider: cars and lanes. We maintain the two concepts separated in the latent space to facilitate the interpretation and manipulation of the perceptual representations. The source code for this paper is available at https://github.com/3lis/rnn_vae.
- Published
2020