27 results for "Marc Peter Deisenroth"
Search Results
2. Mathematics for Machine Learning
- Author
-
Cheng Soon Ong, Marc Peter Deisenroth, and A. Aldo Faisal
- Subjects
Probability and statistics, Mixture model, Machine learning, Support vector machine, Analytic geometry, Linear algebra, Artificial intelligence, Vector calculus
- Abstract
The fundamental mathematical tools needed to understand machine learning include linear algebra, analytic geometry, matrix decompositions, vector calculus, optimization, probability and statistics. These topics are traditionally taught in disparate courses, making it hard for data science or computer science students, or professionals, to efficiently learn the mathematics. This self-contained textbook bridges the gap between mathematical and machine learning texts, introducing the mathematical concepts with a minimum of prerequisites. It uses these concepts to derive four central machine learning methods: linear regression, principal component analysis, Gaussian mixture models and support vector machines. For students and others with a mathematical background, these derivations provide a starting point to machine learning texts. For those learning the mathematics for the first time, the methods help build intuition and practical experience with applying mathematical concepts. Every chapter includes worked examples and exercises to test understanding. Programming tutorials are offered on the book's web site.
- Published
- 2020
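As a small illustration of the book's approach of deriving methods from the underlying mathematics, here is a hedged sketch of one of the four central methods, ordinary least-squares linear regression via the normal equations; the data and numbers are invented for the example, not taken from the book.

```python
import numpy as np

# Maximum-likelihood weights solve the normal equations (X^T X) w = X^T y.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(-1.0, 1.0, 50)])  # bias + feature
true_w = np.array([0.5, 2.0])
y = X @ true_w + 0.01 * rng.standard_normal(50)  # noisy targets

w_hat = np.linalg.solve(X.T @ X, X.T @ y)  # normal-equation solution
```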
3. Deep Reinforcement Learning: A Brief Survey
- Author
-
Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, and Anil A. Bharath
- Subjects
Trust region, Artificial neural network, Computer science, Applied Mathematics, Deep learning, Robotics, Asynchronous communication, Signal Processing, Robot, Reinforcement learning, Artificial intelligence, Electrical and Electronic Engineering
- Abstract
Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence (AI) and represents a step toward building autonomous systems with a higher-level understanding of the visual world. Currently, deep learning is enabling reinforcement learning (RL) to scale to problems that were previously intractable, such as learning to play video games directly from pixels. DRL algorithms are also applied to robotics, allowing control policies for robots to be learned directly from camera inputs in the real world. In this survey, we begin with an introduction to the general field of RL, then progress to the main streams of value-based and policy-based methods. Our survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor-critic. In parallel, we highlight the unique advantages of deep neural networks, focusing on visual understanding via RL. To conclude, we describe several current areas of research within the field.
- Published
- 2017
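The value-based family the survey covers builds on the Q-learning update that DQN approximates with a deep network. A toy tabular sketch on an invented two-state MDP (for illustration only, not from the survey):

```python
import numpy as np

# Two-state chain: action 1 in state 0 moves to state 1 and pays reward 1;
# everything else returns to state 0 with reward 0.
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9

def step(s, a):
    if s == 0 and a == 1:
        return 1, 1.0
    return 0, 0.0

rng = np.random.default_rng(1)
for _ in range(2000):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    s2, r = step(s, a)
    # Q-learning target: r + gamma * max_a' Q(s', a')
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
```

For this deterministic chain the fixed point is Q(0, 1) = 1 / (1 - gamma^2), which the random-exploration loop above approaches.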
4. Gaussian Process Domain Experts for Modeling of Facial Affect
- Author
-
Ognjen Rudovic, Marc Peter Deisenroth, Stefanos Eleftheriadis, and Maja Pantic
- Subjects
Gaussian processes, Multiple AU detection, Multi-view facial expression recognition, Inference, Data modeling, Domain adaptation, Facial expression, Probabilistic logic, Pattern recognition, Weighting, Machine learning, Artificial intelligence, Algorithms, Software
- Abstract
Most existing models for facial behavior analysis rely on generic classifiers, which fail to generalize well to previously unseen data. This is because of inherent differences in source (training) and target (test) data, mainly caused by variation in subjects' facial morphology, camera views, and so on. All of these account for different contexts in which target and source data are recorded, and thus, may adversely affect the performance of the models learned solely from source data. In this paper, we exploit the notion of domain adaptation and propose a data-efficient approach to adapt already learned classifiers to new unseen contexts. Specifically, we build upon the probabilistic framework of Gaussian processes (GPs), and introduce domain-specific GP experts (e.g., for each subject). The model adaptation is facilitated in a probabilistic fashion, by conditioning the target expert on the predictions from multiple source experts. We further exploit the predictive variance of each expert to define an optimal weighting during inference. We evaluate the proposed model on three publicly available data sets for multi-class (MultiPIE) and multi-label (DISFA, FERA2015) facial expression analysis by performing adaptation of two contextual factors: 'where' (view) and 'who' (subject). In our experiments, the proposed approach consistently outperforms: 1) both source and target classifiers, while using a small number of target examples during the adaptation and 2) related state-of-the-art approaches for supervised domain adaptation.
- Published
- 2017
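The variance-based weighting of experts can be illustrated with the standard precision-weighted (product-of-Gaussians) combination; this is a simplified stand-in for the paper's conditioning scheme, with invented numbers:

```python
import numpy as np

# Each expert emits a Gaussian prediction; combining by precision
# (inverse variance) lets confident experts dominate.
means = np.array([0.2, 0.8, 0.5])        # source-expert predictions
variances = np.array([0.04, 1.0, 0.25])  # their predictive variances

precisions = 1.0 / variances
combined_var = 1.0 / precisions.sum()
combined_mean = combined_var * (precisions * means).sum()
```

Note that the combined variance is smaller than any individual expert's variance, reflecting the pooled confidence.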
5. Accelerating the BSM interpretation of LHC data with machine learning
- Author
-
Marc Peter Deisenroth, Jong Soo Kim, Max Welling, Roberto Ruiz de Austri, Gianfranco Bertone, and Sebastian Liem
- Subjects
Physics, Large Hadron Collider, Luminosity (scattering theory), Physics beyond the Standard Model, Detector, Astronomy and Astrophysics, Supersymmetry, Naturalness, Machine learning, Artificial intelligence, Event (particle physics)
- Abstract
The interpretation of Large Hadron Collider (LHC) data in the framework of Beyond the Standard Model (BSM) theories is hampered by the need to run computationally expensive event generators and detector simulators. Performing statistically convergent scans of high-dimensional BSM theories is consequently challenging, and in practice infeasible for very high-dimensional BSM theories. We present here a new machine learning method that accelerates the interpretation of LHC data, by learning the relationship between BSM theory parameters and data. As a proof-of-concept, we demonstrate that this technique accurately predicts natural SUSY signal events in two signal regions at the High Luminosity LHC, up to four orders of magnitude faster than standard techniques. The new approach makes it possible to rapidly and accurately reconstruct the theory parameters of complex BSM theories, should an excess in the data be discovered at the LHC.
- Published
- 2019
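The core idea, replacing an expensive simulator with a learned mapping from theory parameters to predictions, can be sketched as follows; the "simulator", the functional form, and all numbers are hypothetical stand-ins, and the paper's actual regressor differs:

```python
import numpy as np

def expensive_simulator(theta):
    # stand-in for an event generator + detector simulation:
    # expected signal count as a function of one theory parameter
    return 50.0 * np.exp(-theta) + 5.0

# run the expensive simulator on a modest parameter scan once
theta_train = np.linspace(0.1, 3.0, 30)
counts = expensive_simulator(theta_train)

# cheap surrogate: cubic polynomial fit to the log-counts
coeffs = np.polyfit(theta_train, np.log(counts), deg=3)

def surrogate(theta):
    return np.exp(np.polyval(coeffs, theta))

pred = surrogate(1.5)
true = expensive_simulator(1.5)
```

Once fitted, the surrogate is evaluated in microseconds, which is what enables fast statistical scans.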
6. High-dimensional Bayesian optimization using low-dimensional feature spaces
- Author
-
Riccardo Moriconi, K. S. Sesh Kumar, and Marc Peter Deisenroth
- Subjects
Optimization problem, Feature vector, Bayesian optimization, Data compression ratio, Nonlinear system, Curse of dimensionality, Machine learning, Artificial Intelligence, Algorithm, Software
- Abstract
Bayesian optimization (BO) is a powerful approach for seeking the global optimum of expensive black-box functions and has proven successful for fine-tuning hyper-parameters of machine learning models. However, BO is practically limited to optimizing 10–20 parameters. To scale BO to high dimensions, we usually make structural assumptions on the decomposition of the objective and/or exploit the intrinsic lower dimensionality of the problem, e.g. by using linear projections. We could achieve a higher compression rate with nonlinear projections, but learning these nonlinear embeddings typically requires a large amount of data. This contradicts the BO objective of a relatively small evaluation budget. To address this challenge, we propose to learn a low-dimensional feature space jointly with (a) the response surface and (b) a reconstruction mapping. Our approach allows for optimization of BO's acquisition function in the lower-dimensional subspace, which significantly simplifies the optimization problem. We reconstruct the original parameter space from the lower-dimensional subspace for evaluating the black-box function. For meaningful exploration, we solve a constrained optimization problem.
- Published
- 2019
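A stripped-down sketch of the "optimize in a low-dimensional space, reconstruct, then evaluate" loop, using a fixed random linear embedding and random search in place of the paper's learned nonlinear embedding and GP-based acquisition (everything here is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 100, 2                        # ambient and embedded dimensionality
A = rng.standard_normal((D, d))      # fixed reconstruction map z -> x

def objective(x):
    # black-box with intrinsic 2-d structure: depends only on x[0], x[1]
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2

best_z, best_val = None, np.inf
for _ in range(2000):
    z = rng.uniform(-2.0, 2.0, size=d)   # search only in the 2-d subspace
    x = A @ z                            # reconstruct a 100-d candidate
    val = objective(x)                   # evaluate the black box in full space
    if val < best_val:
        best_z, best_val = z, val
```

Searching the 2-d subspace avoids ever sampling the 100-d box directly, which is the compression the abstract refers to.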
7. GPdoemd: a Python package for design of experiments for model discrimination
- Author
-
Sebastian Niedenführ, Simon Olofsson, Lukas Hebing, Marc Peter Deisenroth, and Ruth Misener
- Subjects
Optimal design of experiments, Model parameters, Gaussian process, Mathematical model, Design of experiments, Experimental data, Python (programming language), Approximate inference, Machine learning, Mathematical software
- Abstract
Model discrimination identifies a mathematical model that usefully explains and predicts a given system's behaviour. Researchers will often have several models, i.e. hypotheses, about an underlying system mechanism, but insufficient experimental data to discriminate between the models, i.e. discard inaccurate models. Given rival mathematical models and an initial experimental data set, optimal design of experiments suggests maximally informative experimental observations that maximise a design criterion weighted by prediction uncertainty. The model uncertainty requires gradients, which may not be readily available for black-box models. This paper (i) proposes a new design criterion using the Jensen-Rényi divergence, and (ii) develops a novel method replacing black-box models with Gaussian process surrogates. Using the surrogates, we marginalise out the model parameters with approximate inference. Results show these contributions working well for both classical and new test instances. We also (iii) introduce and discuss GPdoemd, the open-source implementation of the Gaussian process surrogate method.
- Published
- 2018
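The design-criterion idea, choose the experiment where rival models disagree most relative to measurement uncertainty, can be sketched with two invented one-parameter models (this is the intuition only, not GPdoemd's Jensen-Rényi criterion):

```python
import numpy as np

def model_a(x):
    return 2.0 * x        # rival hypothesis: linear response

def model_b(x):
    return x ** 2         # rival hypothesis: quadratic response

candidates = np.linspace(0.0, 3.0, 31)   # feasible experiment settings
sigma2 = 0.25                             # assumed measurement variance

# discriminatory power: squared disagreement scaled by total variance
score = (model_a(candidates) - model_b(candidates)) ** 2 / (2.0 * sigma2)
best_x = candidates[np.argmax(score)]
```

Here the models agree at x = 0 and x = 2, so the most informative experiment is at the boundary x = 3, where their predictions differ most.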
8. Bayesian optimization for learning gaits under uncertainty
- Author
-
Marc Peter Deisenroth, Andre Seyfarth, Roberto Calandra, and Jan Peters
- Subjects
Meta-optimization, Probabilistic-based design optimization, Bayesian optimization, Test functions for optimization, Gait, Robot, Robot locomotion, Machine learning, Artificial Intelligence, Applied Mathematics
- Abstract
Designing gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parametrization, finding near-optimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date. However, no extensive comparison among them has yet been performed. In this article, we thoroughly discuss multiple automatic optimization methods in the context of gait optimization. We extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments.
- Published
- 2015
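A minimal Bayesian-optimization loop in the spirit of the paper: a GP surrogate with an RBF kernel and an expected-improvement acquisition tuning one hypothetical gait parameter. The performance function and all settings are invented, and this is far simpler than the authors' setup:

```python
import numpy as np
from math import erf

def performance(theta):
    return -(theta - 0.6) ** 2          # toy gait score, best at theta = 0.6

def rbf(a, b, ell=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs):
    K = rbf(X, X) + 1e-6 * np.eye(len(X))        # jitter for stability
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

ncdf = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / np.sqrt(2.0))))
npdf = lambda z: np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

grid = np.linspace(0.0, 1.0, 201)
X = np.array([0.1, 0.9])                # two initial "robot experiments"
y = performance(X)
for _ in range(12):                     # small evaluation budget
    mu, sd = gp_posterior(X, y, grid)
    z = (mu - y.max()) / sd
    ei = (mu - y.max()) * ncdf(z) + sd * npdf(z)   # expected improvement
    theta_next = grid[np.argmax(ei)]
    X = np.append(X, theta_next)
    y = np.append(y, performance(theta_next))

best_theta = X[np.argmax(y)]
```

The loop spends its small budget where the surrogate expects the largest improvement, which is exactly why BO suits experiments that are expensive on real robots.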
9. Customer Lifetime Value Prediction Using Embeddings
- Author
-
Marc Peter Deisenroth, Benjamin Paul Chamberlain, Angelo Cardoso, Roberto Pagliari, and C. H. Bryan Liu
- Subjects
Customer lifetime value, E-commerce, Embeddings, Neural networks, Random forests, Loyalty business model, Information retrieval, Machine learning, Artificial intelligence
- Abstract
We describe the Customer LifeTime Value (CLTV) prediction system deployed at ASOS.com, a global online fashion retailer. CLTV prediction is an important problem in e-commerce where an accurate estimate of future value allows retailers to effectively allocate marketing spend, identify and nurture high value customers and mitigate exposure to losses. The system at ASOS provides daily estimates of the future value of every customer and is one of the cornerstones of the personalised shopping experience. The state of the art in this domain uses large numbers of handcrafted features and ensemble regressors to forecast value, predict churn and evaluate customer loyalty. Recently, domains including language, vision and speech have shown dramatic advances by replacing handcrafted features with features that are learned automatically from data. We detail the system deployed at ASOS and show that learning feature representations is a promising extension to the state of the art in CLTV modelling. We propose a novel way to generate embeddings of customers, which addresses the issue of the ever-changing product catalogue and obtains a significant improvement over an exhaustive set of handcrafted features.
- Published
- 2017
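One way to picture the embedding idea, a customer vector formed from (learned) product vectors, which sidesteps per-product handcrafted features; the vectors below are random stand-ins, not the paper's method or ASOS data:

```python
import numpy as np

rng = np.random.default_rng(0)
# in practice these would be learned; here they are random placeholders
product_vecs = {f"p{i}": rng.standard_normal(4) for i in range(6)}

def customer_embedding(history):
    # represent a customer as the mean of the product vectors they touched
    return np.mean([product_vecs[p] for p in history], axis=0)

emb = customer_embedding(["p0", "p3", "p3"])
```

A new catalogue item only needs its own vector; the customer representation and downstream model are unchanged.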
10. Probabilistic movement modeling for intention inference in human–robot interaction
- Author
-
Bernhard Schölkopf, Marc Peter Deisenroth, Zhikun Wang, Katharina Mülling, David Vogt, Heni Ben Amor, and Jan Peters
- Subjects
Inference, Human–robot interaction, Bayes' theorem, Probabilistic logic, Approximate inference, Robotics, Robot, Humanoid robot, Machine learning, Artificial Intelligence, Software
- Abstract
Intention inference can be an essential step toward efficient human–robot interaction. For this purpose, we propose the Intention-Driven Dynamics Model (IDDM) to probabilistically model the generative process of movements that are directed by the intention. The IDDM allows the intention to be inferred from observed movements using Bayes’ theorem. The IDDM simultaneously finds a latent state representation of noisy and high-dimensional observations, and models the intention-driven dynamics in the latent states. As most robotics applications are subject to real-time constraints, we develop an efficient online algorithm that allows for real-time intention inference. Two human–robot interaction scenarios, i.e. target prediction for robot table tennis and action recognition for interactive humanoid robots, are used to evaluate the performance of our inference algorithm. In both intention inference tasks, the proposed algorithm achieves substantial improvements over support vector machines and Gaussian processes.
- Published
- 2013
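The Bayes'-theorem step at the heart of intention inference can be sketched in isolation; the observation model below is an invented Gaussian stand-in, whereas the IDDM infers intentions through learned latent dynamics:

```python
import numpy as np

intentions = ["left", "right"]
prior = np.array([0.5, 0.5])            # p(intention) before observing

def likelihood(obs):
    # assumed Gaussian observation models per intention (means -1 and +1)
    means = np.array([-1.0, 1.0])
    return np.exp(-0.5 * (obs - means) ** 2)

obs = 0.8                               # observed movement drifting right
post = prior * likelihood(obs)          # p(intention | obs) up to a constant
post /= post.sum()
```

With the observation drifting toward +1, the posterior mass shifts to the "right" intention.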
11. Social and Affective Robotics Tutorial
- Author
-
Bjoern Schuller, Vanessa Evers, Maja Pantic, Marc Peter Deisenroth, and Luis Merino
- Subjects
Social robot, Robotics, Affect (psychology), Affective computing, Human–computer interaction, Multidisciplinary approach, Robot, Engineering ethics
- Abstract
Social and Affective Robotics is a growing multidisciplinary field encompassing computer science, engineering, psychology, education, and many other disciplines. It explores how social and affective factors influence interactions between humans and robots, and how affect and social signals can be sensed and integrated into the design, implementation, and evaluation of robots. With talks by renowned researchers in this area, the Social and Affective Robotics Tutorial will help both new and experienced researchers to identify trends, concepts, methodologies and applications in this field, which has been identified as a technological megatrend driving the fourth industrial revolution.
- Published
- 2016
12. Learning torque control in presence of contacts using tactile sensing from robot skin
- Author
-
Roberto Calandra, Serena Ivaldi, Marc Peter Deisenroth, and Jan Peters
- Subjects
Robotics, Contact force, Inverse dynamics, Robot control, Tactile sensor, iCub, Humanoid robot, Computer vision, Artificial intelligence, Simulation
- Abstract
Whole-body control in unknown environments is challenging: Unforeseen contacts with obstacles can lead to poor tracking performance and potential physical damage to the robot. Hence, a whole-body control approach for future humanoid robots in (partially) unknown environments needs to take contact sensing into account, e.g., by means of artificial skin. However, translating contacts from skin measurements into physically well-understood quantities can be problematic as the exact position and strength of the contact needs to be converted into torques. In this paper, we suggest an alternative approach that directly learns the mapping from both skin and the joint state to torques. We propose to learn such an inverse dynamics model with contacts using a mixture-of-contacts approach that exploits the linear superimposition of contact forces. The learned model can, making use of uncalibrated tactile sensors, accurately predict the torques needed to compensate for the contact. As a result, tracking of trajectories with obstacles and tactile contact can be executed more accurately. We demonstrate on the humanoid robot iCub that our approach improves the tracking error in the presence of dynamic contacts.
- Published
- 2015
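The linear-superposition assumption can be illustrated with a plain least-squares fit from joint state plus a tactile reading to torque; the dynamics and data are simulated stand-ins, and the paper's mixture-of-contacts model is more elaborate:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
q, dq = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)   # joint pos/vel
skin = rng.uniform(0, 1, n)                             # raw tactile reading

# hypothetical ground truth: dynamics torque plus a contact torque that
# superimposes linearly, mirroring the paper's superposition assumption
tau = 3.0 * q + 0.5 * dq + 2.0 * skin

# regress torque directly on [joint state, skin] without calibrating the skin
Phi = np.column_stack([q, dq, skin])
w, *_ = np.linalg.lstsq(Phi, tau, rcond=None)
```

Because the contact contribution enters additively, the uncalibrated skin signal can be absorbed as just another regressor.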
13. Gaussian Processes for Data-Efficient Learning in Robotics and Control
- Author
-
Dieter Fox, Marc Peter Deisenroth, and Carl Edward Rasmussen
- Subjects
Reinforcement learning, Gaussian processes, Bayesian inference, Robot learning, Policy search, Probabilistic logic, Robotics, Control, Machine learning, Artificial Intelligence, Computer Vision and Pattern Recognition, Software
- Abstract
Autonomous learning has been a promising direction in control and robotics for more than a decade since data-driven learning makes it possible to reduce the amount of engineering knowledge, which is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this article, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
- Published
- 2015
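The ingredient that makes the approach data-efficient, a GP transition model whose predictive variance grows away from observed transitions, can be sketched in one dimension (toy data; the paper additionally propagates this uncertainty through long-term planning):

```python
import numpy as np

def k(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

X = np.array([-1.0, 0.0, 1.0])           # observed states
y = np.sin(X)                             # observed next-state targets
K = k(X, X) + 1e-8 * np.eye(len(X))      # jitter for stability

def predict(xs):
    ks = k(X, xs)
    mu = ks.T @ np.linalg.solve(K, y)
    # predictive variance: prior variance minus the explained part
    var = 1.0 - np.sum(ks * np.linalg.solve(K, ks), axis=0)
    return mu, var

mu_near, var_near = predict(np.array([0.1]))   # close to the data
mu_far, var_far = predict(np.array([3.0]))     # far from the data
```

A planner that sees the large variance at 3.0 can avoid trusting the model there, which is how model errors are kept from compounding.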
14. Learning deep dynamical models from image pixels
- Author
-
Marc Peter Deisenroth, Thomas B. Schön, and Niklas Wahlström
- Subjects
Dynamical systems theory, System identification, Deep neural networks, Autoencoder, Low-dimensional embedding, Nonlinear systems, Pixel, Signal processing, Robotics, Control and Systems Engineering, Artificial intelligence
- Abstract
Modeling dynamical systems is important in many disciplines, e.g., control, robotics, or neurotechnology. Commonly the state of these systems is not directly observed, but only available through noisy and potentially high-dimensional observations. In these cases, system identification, i.e., finding the measurement mapping and the transition mapping (system dynamics) in latent space can be challenging. For linear system dynamics and measurement mappings efficient solutions for system identification are available. However, in practical applications, the linearity assumption does not hold, requiring non-linear system identification techniques. If additionally the observations are high-dimensional (e.g., images), non-linear system identification is inherently hard. To address the problem of non-linear system identification from high-dimensional observations, we combine recent advances in deep learning and system identification. In particular, we jointly learn a low-dimensional embedding of the observation by means of deep auto-encoders and a predictive transition model in this low-dimensional space. We demonstrate that our model enables learning good predictive models of dynamical systems from pixel information only.
- Published
- 2014
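A linear caricature of the pipeline, with PCA standing in for the deep auto-encoder: embed high-dimensional observations, then fit a transition model in the latent space (the generating system and dimensions are invented for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 200, 20

# hidden latent dynamics z_{t+1} = 0.9 z_t, seen through a random linear map
z = np.empty(T)
z[0] = 1.0
for t in range(T - 1):
    z[t + 1] = 0.9 * z[t]
C = rng.standard_normal(D)
Y = np.outer(z, C)                       # observations, shape (T, D)

# "encoder": project onto the first principal component of the observations
Yc = Y - Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Yc, full_matrices=False)
code = Yc @ Vt[0]                        # 1-d embedding of each observation

# transition model in latent space: code_{t+1} ~ a * code_t + b
A = np.column_stack([code[:-1], np.ones(T - 1)])
coef, *_ = np.linalg.lstsq(A, code[1:], rcond=None)
```

Because the embedding is (up to sign and scale) an affine image of the true latent state, the fitted slope recovers the hidden dynamics coefficient 0.9; the paper's contribution is doing this jointly and nonlinearly for real images.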
15. Policy search for learning robot control using sparse data
- Author
-
Marc Peter Deisenroth, Duy Nguyen-Tuong, H. van Hoof, Carl Edward Rasmussen, Alois Knoll, A. McHutchon, Jan Peters, and Bastian Bischoff
- Subjects
Reinforcement learning, Robot learning, Robot control, Probabilistic logic, Computational learning theory, Machine learning, Artificial intelligence
- Abstract
© 2014 IEEE. In many complex robot applications, such as grasping and manipulation, it is difficult to program desired task solutions beforehand, as robots are within an uncertain and dynamic environment. In such cases, learning tasks from experience can be a useful alternative. To obtain a sound learning and generalization performance, machine learning, especially reinforcement learning, usually requires sufficient data. However, in cases where only little data is available for learning, due to system constraints and practical issues, reinforcement learning can act suboptimally. In this paper, we investigate how model-based reinforcement learning, in particular the probabilistic inference for learning control method (Pilco), can be tailored to cope with the case of sparse data to speed up learning. The basic idea is to include further prior knowledge into the learning process. As Pilco is built on the probabilistic Gaussian processes framework, additional system knowledge can be incorporated by defining appropriate prior distributions, e.g., a linear mean Gaussian prior. The resulting Pilco formulation remains in closed form and analytically tractable. The proposed approach is evaluated in simulation as well as on a physical robot, the Festo Robotino XT. For the robot evaluation, we employ the approach for learning an object pick-up task. The results show that by including prior knowledge, policy learning can be sped up in the presence of sparse data.
- Published
- 2014
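The "linear mean Gaussian prior" idea can be sketched by fitting an assumed linear prior mean and modelling only the residuals with a GP, so that predictions far from the sparse data revert to the prior model rather than to zero (toy data, not the Pilco implementation):

```python
import numpy as np

def k(a, b, ell=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

X = np.array([0.0, 0.5, 1.0])            # sparse training inputs
y = 2.0 * X + 0.1 * np.sin(8 * X)        # nearly linear system response

# linear prior mean m(x) = w[0] * x + w[1], fit by least squares
Phi = np.column_stack([X, np.ones_like(X)])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
resid = y - Phi @ w                      # GP only models the residuals

K = k(X, X) + 1e-8 * np.eye(len(X))

def predict(xs):
    mean_prior = w[0] * xs + w[1]
    return mean_prior + k(X, xs).T @ np.linalg.solve(K, resid)

far = predict(np.array([3.0]))           # well outside the data range
```

With a zero-mean GP the prediction at 3.0 would collapse to 0; with the linear mean it extrapolates along the prior model, which is the extra system knowledge the abstract describes.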
16. Multi-task policy search for robotics
- Author
-
Jan Peters, Marc Peter Deisenroth, Peter Englert, and Dieter Fox
- Subjects
Evolutionary robotics, Robotics, Robot learning, Reinforcement learning, Machine learning, Artificial intelligence
- Abstract
© 2014 IEEE. Learning policies that generalize across multiple tasks is an important and challenging research topic in reinforcement learning and robotics. Training individual policies for every single potential task is often impractical, especially for continuous task variations, requiring more principled approaches to share and transfer knowledge among similar tasks. We present a novel approach for learning a nonlinear feedback policy that generalizes across multiple tasks. The key idea is to define a parametrized policy as a function of both the state and the task, which allows learning a single policy that generalizes across multiple known and unknown tasks. Applications of our novel approach to reinforcement and imitation learning in real-robot experiments are shown.
- Published
- 2014
17. Bayesian Gait Optimization for Bipedal Locomotion
- Author
-
Marc Peter Deisenroth, Jan Peters, Andre Seyfarth, Roberto Calandra, and Nakul Gopalan
- Subjects
Stochastic control ,business.industry ,Computer science ,Bayesian optimization ,Bayesian probability ,Machine learning ,computer.software_genre ,Gait ,Computer Science::Robotics ,ComputingMethodologies_PATTERNRECOGNITION ,Computer Science::Systems and Control ,Robustness (computer science) ,Proof of concept ,Robot ,Artificial intelligence ,business ,Performance metric ,computer - Abstract
One of the key challenges in robotic bipedal locomotion is finding gait parameters that optimize a desired performance criterion, such as speed, robustness or energy efficiency. Typically, gait optimization requires extensive robot experiments and specific expert knowledge. We propose to apply data-driven machine learning to automate and speed up the process of gait optimization. In particular, we use Bayesian optimization to efficiently find gait parameters that optimize the desired performance metric. As a proof of concept we demonstrate that Bayesian optimization is near-optimal in a classical stochastic optimal control framework. Moreover, we validate our approach to Bayesian gait optimization on a low-cost and fragile real bipedal walker and show that good walking gaits can be efficiently found by Bayesian optimization. © 2014 Springer International Publishing.
- Published
- 2014
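A minimal sketch of the method described above: Bayesian optimization with a GP surrogate and the expected-improvement acquisition function. The one-dimensional "gait cost", length-scale, and trial budget are invented for illustration; in the paper each candidate is evaluated on the physical walker.

```python
import numpy as np
from math import erf, sqrt, pi

# Standard normal cdf/pdf without external dependencies.
norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))
norm_pdf = lambda z: np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)

def rbf(A, B, ls=0.2):
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_predict(X, y, Xs, noise=1e-4):
    # GP posterior mean and std for a zero-mean prior with unit signal variance.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xs, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for minimisation: expected amount by which we improve on `best`.
    z = (best - mu) / sigma
    return (best - mu) * norm_cdf(z) + sigma * norm_pdf(z)

# Toy "gait cost" with its optimum at x = 0.3, standing in for a robot trial.
cost = lambda x: (x - 0.3) ** 2

X = np.array([0.0, 1.0])                  # two initial trials
y = cost(X)
grid = np.linspace(0.0, 1.0, 201)
for _ in range(10):                       # BO loop: fit GP, maximise EI, evaluate
    mu, sigma = gp_predict(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X, y = np.append(X, x_next), np.append(y, cost(x_next))
best_x = X[np.argmin(y)]                  # near the optimum after few trials
```

The data efficiency comes from the acquisition function trading off exploring uncertain gait parameters against exploiting parameters predicted to perform well.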
18. Model-based contextual policy search for data-efficient generalization of robot skills
- Author
-
Gerhard Neumann, Ai Poh Loh, Marc Peter Deisenroth, Jan Peters, Andras Kupcsik, and Prahlad Vadakkepat
- Subjects
0209 industrial biotechnology ,Linguistics and Language ,Computer science ,Process (engineering) ,media_common.quotation_subject ,Context (language use) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Robot learning ,Language and Linguistics ,020901 industrial engineering & automation ,Artificial Intelligence ,Search algorithm ,0202 electrical engineering, electronic engineering, information engineering ,Reinforcement learning ,Function (engineering) ,media_common ,business.industry ,Probabilistic logic ,H671 Robotics ,Robot ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,G760 Machine Learning ,computer - Abstract
© 2014 Elsevier B.V. In robotics, lower-level controllers are typically used to make the robot solve a specific task in a fixed context. For example, the lower-level controller can encode a hitting movement while the context defines the target coordinates to hit. However, in many learning problems the context may change between task executions. To adapt the policy to a new context, we utilize a hierarchical approach by learning an upper-level policy that generalizes the lower-level controllers to new contexts. A common approach to learning such upper-level policies is to use policy search. However, the majority of current contextual policy search approaches are model-free and require a high number of interactions with the robot and its environment. Model-based approaches are known to significantly reduce the number of robot experiments; however, current model-based techniques cannot be applied straightforwardly to the problem of learning contextual upper-level policies. They rely on specific parametrizations of the policy and the reward function, which are often unrealistic in the contextual policy search formulation. In this paper, we propose a novel model-based contextual policy search algorithm that is able to generalize lower-level controllers and is data-efficient. Our approach is based on learned probabilistic forward models and information-theoretic policy search. Unlike current algorithms, our method does not require any assumption on the parametrization of the policy or the reward function. We show on complex simulated robotic tasks and in a real robot experiment that the proposed learning framework speeds up the learning process by up to two orders of magnitude in comparison to existing methods, while learning high-quality policies.
- Published
- 2014
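The hierarchical idea, an upper-level policy mapping contexts to lower-level controller parameters, can be illustrated with a deliberately simplified linear map fitted by ridge regression. This stands in for, and is far simpler than, the information-theoretic model-based update in the paper; all quantities below are invented.

```python
import numpy as np

# Lower-level controllers are parametrised by omega; the context s (e.g.
# target coordinates) varies between task executions. A hypothetical
# upper-level policy omega = W @ phi(s) generalises the controller across
# contexts; here it is fitted on (context, good-parameter) pairs.
phi = lambda s: np.array([1.0, s])            # context features with a bias term

rng = np.random.default_rng(0)
contexts = rng.uniform(-1.0, 1.0, 50)
omegas = 3.0 * contexts + 0.5                 # "good" parameters per context

Phi = np.stack([phi(s) for s in contexts])    # (50, 2) feature matrix
lam = 1e-6                                    # ridge regulariser
W = np.linalg.solve(Phi.T @ Phi + lam * np.eye(2), Phi.T @ omegas)

omega_new = W @ phi(0.25)                     # generalise to an unseen context
```

The point of the hierarchy is that only the small upper-level map must be learned per task family, while the lower-level controllers stay fixed.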
19. An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion
- Author
-
Jan Peters, Roberto Calandra, Marc Peter Deisenroth, and Andre Seyfarth
- Subjects
business.industry ,Computer science ,Bayesian optimization ,Machine learning ,computer.software_genre ,Gait ,Control theory ,Key (cryptography) ,Robot ,Artificial intelligence ,Bipedalism ,business ,computer ,Robot locomotion - Abstract
The design of gaits and corresponding control policies for bipedal walkers is a key challenge in robot locomotion. Even when a viable controller parametrization already exists, finding near-optimal parameters can be daunting. The use of automatic gait optimization methods greatly reduces the need for human expertise and time-consuming design processes. Many different approaches to automatic gait optimization have been suggested to date. However, no extensive comparison among them has yet been performed. In this paper, we present some common methods for automatic gait optimization in bipedal locomotion, and analyze their strengths and weaknesses. We experimentally evaluated these gait optimization methods on a bipedal robot, in more than 1800 experimental evaluations. In particular, we analyzed Bayesian optimization in different configurations, including various acquisition functions.
- Published
- 2014
20. Manifold Gaussian Processes for Regression
- Author
-
Marc Peter Deisenroth, Roberto Calandra, Carl Edward Rasmussen, and Jan Peters
- Subjects
FOS: Computer and information sciences ,Feature vector ,Machine Learning (stat.ML) ,02 engineering and technology ,01 natural sciences ,Machine Learning (cs.LG) ,010104 statistics & probability ,symbols.namesake ,Statistics - Machine Learning ,0202 electrical engineering, electronic engineering, information engineering ,0101 mathematics ,Representation (mathematics) ,Gaussian process ,Mathematics ,Smoothness ,business.industry ,Supervised learning ,Pattern recognition ,Covariance ,Manifold ,Computer Science - Learning ,Transformation (function) ,symbols ,020201 artificial intelligence & image processing ,Artificial intelligence ,business - Abstract
Off-the-shelf Gaussian Process (GP) covariance functions encode smoothness assumptions on the structure of the function to be modeled. To model complex and non-differentiable functions, these smoothness assumptions are often too restrictive. One way to alleviate this limitation is to find a different representation of the data by introducing a feature space. This feature space is often learned in an unsupervised way, which might lead to data representations that are not useful for the overall regression task. In this paper, we propose Manifold Gaussian Processes, a novel supervised method that jointly learns a transformation of the data into a feature space and a GP regression from the feature space to the observed space. The Manifold GP is a full GP and makes it possible to learn data representations that are useful for the overall regression task. As a proof of concept, we evaluate our approach on complex non-smooth functions where standard GPs perform poorly, such as step functions and robotics tasks with contacts.
- Published
- 2014
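A toy sketch of the Manifold GP idea: regress in a feature space rather than on the raw input. Here the transform is a hand-picked steep tanh (in the paper, the transform and the GP are optimised jointly), applied to a step function that a stationary GP on the raw input would smooth over.

```python
import numpy as np

def rbf(A, B, ls=0.3):
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_predict(F, y, Fs, noise=1e-4):
    # Plain GP posterior mean, but computed on feature-space inputs F, Fs.
    K = rbf(F, F) + noise * np.eye(len(F))
    return rbf(Fs, F) @ np.linalg.solve(K, y)

# A step function: hard for a stationary GP on the raw input, because the
# smoothness assumption is violated at the discontinuity.
X = np.linspace(-1.0, 1.0, 21)
y = (X > 0).astype(float)

# Hand-picked steep tanh feature standing in for the learned transform.
feat = lambda x: np.tanh(20.0 * x)

Xs = np.array([0.3, -0.3])
pred = gp_predict(feat(X), y, feat(Xs))
# In the warped space the step is easy to represent: pred ≈ [1, 0].
```

The tanh warp clusters the inputs on either side of the discontinuity, so in feature space the target is effectively smooth and the GP's assumptions hold.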
21. Model-based imitation learning by probabilistic trajectory matching
- Author
-
Jan Peters, Alexandros Paraschos, Marc Peter Deisenroth, and Peter Englert
- Subjects
0209 industrial biotechnology ,Computer science ,business.industry ,Probabilistic logic ,02 engineering and technology ,Machine learning ,computer.software_genre ,Robot learning ,Task (project management) ,020901 industrial engineering & automation ,Robustness (computer science) ,0202 electrical engineering, electronic engineering, information engineering ,Trajectory ,Robot ,Reinforcement learning ,Probability distribution ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
One of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot.
- Published
- 2013
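The matching objective described above, the KL divergence between two Gaussian trajectory distributions, has the familiar closed form for multivariate Gaussians. The sketch below illustrates only the cost, not the full model-based learner:

```python
import numpy as np

def gauss_kl(m0, S0, m1, S1):
    # Closed-form KL(N(m0, S0) || N(m1, S1)) -- the matching cost between a
    # predicted trajectory distribution and the demonstrated one, assuming
    # both are (approximated as) multivariate Gaussians.
    k = len(m0)
    S1_inv = np.linalg.inv(S1)
    diff = m1 - m0
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

m = np.zeros(2)
S = np.eye(2)
kl_same = gauss_kl(m, S, m, S)                      # identical -> 0
kl_shift = gauss_kl(m, S, np.array([1.0, 0.0]), S)  # unit mean shift -> 0.5
```

Minimising this quantity with respect to the policy parameters drives the predicted trajectory distribution toward the demonstrated one.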
22. Probabilistic Modeling of Human Movements for Intention Inference
- Author
-
Jan Peters, Zhikun Wang, Marc Peter Deisenroth, Heni Ben Amor, Bernhard Schölkopf, and David Vogt
- Subjects
Computer science ,business.industry ,Probabilistic logic ,Inference ,Machine learning ,computer.software_genre ,Regression ,Approximate inference ,Table (database) ,Robot ,Artificial intelligence ,business ,Latent variable model ,computer ,Humanoid robot - Abstract
Inference of human intention may be an essential step towards understanding human actions [21] and is hence important for realizing efficient human-robot interaction. In this paper, we propose the Intention-Driven Dynamics Model (IDDM), a latent variable model for inferring unknown human intentions. We train the model based on observed human behaviors/actions and we introduce an approximate inference algorithm to efficiently infer the human’s intention from an ongoing action. We verify the feasibility of the IDDM in two scenarios, i.e., target inference in robot table tennis and action recognition for interactive humanoid robots. In both tasks, the IDDM achieves substantial improvements over state-of-the-art regression and classification.
- Published
- 2012
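The inference step can be illustrated with a deliberately simple discrete-intention analogue: a Bayesian belief update over candidate targets as observations of an ongoing movement arrive. The Gaussian observation model below is invented and stands in for the learned dynamics of the IDDM.

```python
import numpy as np

def gauss_pdf(x, mu, s=0.5):
    # Hypothetical likelihood p(observation | intention) -- a stand-in for
    # the learned intention-driven dynamics model.
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

targets = np.array([-1.0, 0.0, 1.0])   # candidate intentions (e.g. ball targets)
belief = np.ones(3) / 3.0              # uniform prior over intentions

for obs in [0.8, 0.9, 1.1]:            # ongoing, partially observed movement
    belief *= gauss_pdf(obs, targets)  # likelihood of each intention
    belief /= belief.sum()             # renormalise the posterior
inferred = targets[np.argmax(belief)]
```

As more of the action is observed, the posterior concentrates on the correct intention, which is what allows early prediction, e.g. of a table-tennis target.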
23. Toward Fast Policy Search for Learning Legged Locomotion
- Author
-
Jan Peters, Marc Peter Deisenroth, Roberto Calandra, and Andre Seyfarth
- Subjects
Engineering ,Adaptive control ,business.industry ,Probabilistic logic ,Control engineering ,Robotics ,Data modeling ,Gait (human) ,Compass ,Artificial intelligence ,business ,Humanoid robot ,ComputingMethodologies_COMPUTERGRAPHICS ,Curse of dimensionality - Abstract
Legged locomotion is one of the most versatile forms of mobility. However, despite the importance of legged locomotion and the large number of legged robotics studies, no biped or quadruped matches the agility and versatility of its biological counterparts to date. Approaches to designing controllers for legged locomotion systems are often based on either the assumption of perfectly known dynamics or mechanical designs that substantially reduce the dimensionality of the problem. The few existing approaches for learning controllers for legged systems either require exhaustive real-world data or improve controllers only conservatively, leading to slow learning. We present a data-efficient approach to learning feedback controllers for legged locomotion systems, based on learned probabilistic forward models for generating walking policies. On a compass walker, we show that our approach allows gait policies to be learned from very little data. Moreover, we analyze learned locomotion models of a biomechanically inspired biped. Our approach has the potential to scale to high-dimensional humanoid robots with little loss in efficiency.
- Published
- 2012
24. Learning Deep Belief Networks from Non-stationary Streams
- Author
-
Federico Montesino Pouzols, Marc Peter Deisenroth, Tapani Raiko, and Roberto Calandra
- Subjects
Concept drift ,Computer science ,business.industry ,Data stream mining ,Deep learning ,02 engineering and technology ,Semi-supervised learning ,Machine learning ,computer.software_genre ,Deep belief network ,Generative model ,Constant (computer programming) ,020204 information systems ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Adaptive learning ,Artificial intelligence ,business ,computer - Abstract
Deep learning has proven to be beneficial for complex tasks such as classifying images. However, this approach has been mostly applied to static datasets. The analysis of non-stationary streams of data (e.g., under concept drift) involves specific issues connected with the temporal and changing nature of the data. In this paper, we propose Adaptive Deep Belief Networks, a proof-of-concept method showing how deep learning can be generalized to learn online from changing streams of data. We do so by exploiting the generative properties of the model to incrementally re-train the Deep Belief Network whenever new data are collected. This approach eliminates the need to store past observations and, therefore, requires only constant memory consumption. Hence, our approach can be valuable for life-long learning from non-stationary data streams. © 2012 Springer-Verlag.
- Published
- 2012
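The constant-memory idea, regenerating pseudo-data from the current model instead of storing past observations, can be shown with a toy 1-D Gaussian in place of a DBN. This is only an analogue of the generative-replay mechanism; the drift schedule and sample sizes are invented.

```python
import numpy as np

# Toy illustration: keep only a generative model (here a 1-D Gaussian) and
# refit it on samples drawn from itself plus each newly arrived batch, so
# past batches never need to be stored.
rng = np.random.default_rng(0)

mu, sigma = 0.0, 1.0                          # current generative model
for t in range(5):                            # non-stationary stream: drifting mean
    batch = rng.normal(loc=t + 1.0, scale=1.0, size=200)
    replay = rng.normal(mu, sigma, size=200)  # samples from the old model
    data = np.concatenate([replay, batch])
    mu, sigma = data.mean(), data.std()
# The model tracks the drift with constant memory: only (mu, sigma) persist.
```

The same pattern, sample from the current model, mix with fresh data, retrain, is what the generative properties of the DBN make possible at scale.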
25. Gambit: An autonomous chess-playing robotic system
- Author
-
Mike Kung, Liefeng Bo, Joshua R. Smith, Roberto Aimi, Robert Chu, Dieter Fox, Marc Peter Deisenroth, Brian Mayton, Louis LeGrand, and Cynthia Matuszek
- Subjects
Engineering ,Gambit ,business.industry ,Testbed ,ComputingMilieux_PERSONALCOMPUTING ,Cognitive neuroscience of visual object recognition ,Robot manipulator ,Object detection ,Robotic systems ,Human–computer interaction ,Robot vision ,Artificial intelligence ,Manipulator ,business - Abstract
This paper presents Gambit, a custom, mid-cost 6-DoF robot manipulator system that can play physical board games against human opponents in non-idealized environments. Historically, unconstrained robotic manipulation in board games has often proven to be more challenging than the underlying game reasoning, making it an ideal testbed for small-scale manipulation. The Gambit system includes a low-cost Kinect-style visual sensor, a custom manipulator, and state-of-the-art learning algorithms for automatic detection and recognition of the board and the objects on it. As a use case, we describe playing chess quickly and accurately with arbitrary, uninstrumented boards and pieces, demonstrating that Gambit's engineering and design represent a new state of the art in fast, robust tabletop manipulation. © 2011 IEEE.
- Published
- 2011
26. Probabilistic Inference for Fast Learning in Control
- Author
-
Carl Edward Rasmussen, Marc Peter Deisenroth, Girgin, S, Loth, M, Munos, R, Preux, P, and Ryabko, D
- Subjects
Learning classifier system ,Computer science ,business.industry ,Divergence-from-randomness model ,Probabilistic logic ,Machine learning ,computer.software_genre ,Task (project management) ,symbols.namesake ,symbols ,Reinforcement learning ,State (computer science) ,Artificial intelligence ,business ,computer ,Gaussian process ,Probabilistic relevance model - Abstract
We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.
- Published
- 2008
27. Gaussian Process Domain Experts for Model Adaptation in Facial Behavior Analysis
- Author
-
Maja Pantic, Marc Peter Deisenroth, Stefanos Eleftheriadis, and Ognjen Rudovic
- Subjects
FOS: Computer and information sciences ,HMI-HF: Human Factors ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,cs.LG ,Computer Science - Computer Vision and Pattern Recognition ,Gaussian processes ,Machine Learning (stat.ML) ,02 engineering and technology ,Machine learning ,computer.software_genre ,METIS-320877 ,Machine Learning (cs.LG) ,symbols.namesake ,Statistics - Machine Learning ,0202 electrical engineering, electronic engineering, information engineering ,Gaussian process ,cs.CV ,Facial expression ,business.industry ,Probabilistic logic ,020206 networking & telecommunications ,stat.ML ,facial behavior analysis ,Computer Science - Learning ,ComputingMethodologies_PATTERNRECOGNITION ,symbols ,020201 artificial intelligence & image processing ,Artificial intelligence ,EWI-27133 ,business ,Classifier (UML) ,computer ,IR-103096 - Abstract
We present a novel approach for supervised domain adaptation that is based upon the probabilistic framework of Gaussian processes (GPs). Specifically, we introduce domain-specific GPs as local experts for facial expression classification from face images. The adaptation of the classifier is facilitated in a probabilistic fashion by conditioning the target expert on multiple source experts. Furthermore, in contrast to existing adaptation approaches, we also learn a target expert solely from available target data. Then, a single and confident classifier is obtained by combining the predictions from multiple experts based on their confidence. Learning of the model is efficient and requires no retraining/reweighting of the source classifiers. We evaluate the proposed approach on two publicly available datasets for multi-class (MultiPIE) and multi-label (DISFA) facial expression classification. To this end, we perform adaptation of two contextual factors: 'where' (view) and 'who' (subject). We show in our experiments that the proposed approach consistently outperforms both source and target classifiers, while using as few as 30 target examples. It also outperforms the state-of-the-art approaches for supervised domain adaptation.
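One simple instance of confidence-based combination of expert predictions is a product of Gaussians, which weights each expert's predictive mean by its precision. This is an illustrative sketch, not necessarily the paper's exact combination rule; the numbers below are invented.

```python
import numpy as np

# Hypothetical predictive distributions from three domain experts for one
# test point: means and variances (uncertainties). Fusing them as a product
# of Gaussians weights each expert by its precision (1 / variance), so
# confident experts dominate the combined prediction.
means = np.array([0.9, 0.2, 0.85])
variances = np.array([0.01, 1.0, 0.02])   # expert 2 is very uncertain

precisions = 1.0 / variances
fused_var = 1.0 / precisions.sum()
fused_mean = fused_var * (precisions * means).sum()
# The fused mean stays close to the two confident experts, and the fused
# variance is smaller than that of any single expert.
```

This is why the combined classifier can outperform both source and target experts: wherever one expert is unreliable, its low precision automatically down-weights it.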