1. A Structure of Restricted Boltzmann Machine for Modeling System Dynamics
- Author
-
Olivier Bach, Frédéric Alexandre, Denis Penninckx, Alain Hugget, Guillaume Padiolleau, Mnemonic Synergy (Mnemosyne), Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut des Maladies Neurodégénératives [Bordeaux] (IMN), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), IEEE, and Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Inria Bordeaux - Sud-Ouest
- Subjects
Computer Science::Machine Learning ,Restricted Boltzmann machine ,Artificial neural network ,Unsupervised Deep Learning ,business.industry ,Stochastic process ,02 engineering and technology ,[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE] ,System dynamics ,03 medical and health sciences ,0302 clinical medicine ,State Representation Learning ,0202 electrical engineering, electronic engineering, information engineering ,Unsupervised learning ,Reinforcement learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Representation (mathematics) ,Feature learning ,Factored Restricted Boltzmann Machine ,030217 neurology & neurosurgery - Abstract
International audience; This paper presents a new approach for learning transition function in state representation learning (SRL) for control. While state-of-the-art methods use different deterministic neural networks to learn forward and inverse state transition functions independently with auto-supervised learning, we introduce a bidirectional stochastic model to learn both transition functions. We aim at using the uncertainty of the model on its predictions as an intrinsic motivation for exploration to enhance the representation learning. More, using the same model to learn both transition functions allows sharing the parameters, which can reduce their number and should increase the embedding quality of the representation. We use a factored restricted Boltzmann machine (fRBM) based model, enhanced with dedicated structure for learning system dynamics and transitions with shared parameters. The presented work focuses on building the structure of the bidirectional transition model for unsupervised learning. Our fRBM structure is directly inspired from physics interactions between inputs and outputs in reinforcement learning framework. We compare different training algorithms for learning the model that must be able to predict observable random variables to be used in SRL framework. Our structure is not restricted to any type of observable, nevertheless in this paper we focus on learning dynamics from the OpenAI Gym environment Swinging Pendulum. We show that the proposed structure is able to learn bidirectional transition function and performs well in prediction task.
- Published
- 2020
- Full Text
- View/download PDF