20 results on '"Don Wunsch"'
Search Results
2. Guidance in the Use of Adaptive Critics for Control
- Author
-
Jennie Si, Warren B. Powell, Andrew G. Barto, and Don Wunsch
- Subjects
Dynamic programming, Adaptive critics, Management science, Computer science, Control (management), Reinforcement learning, Artificial intelligence
- Abstract
This chapter, along with Chapter 3, provides an overview of several ADP design techniques. While Chapter 3 deals more with the theoretical foundations, this chapter (Chapter 4) is devoted more to practical issues such as problem formulation and utility functions. The authors discuss issues associated with designing and training adaptive critics using the design techniques introduced in Chapter 3.
- Published
- 2009
- Full Text
- View/download PDF
3. The Linear Programming Approach to Approximate Dynamic Programming
- Author
-
Warren B. Powell, Don Wunsch, Andrew G. Barto, and Jennie Si
- Subjects
Dynamic programming, Constraint (information theory), Queueing theory, Mathematical optimization, Linear programming, Computer science, Approximation error, Markov decision process, Curse of dimensionality, Dual (category theory)
- Abstract
This chapter addresses the issue of the "curse of dimensionality" by treating ADP as the dual of the linear programming problem and introduces the concept of approximate linear programming (ALP). It provides a brief introduction to the use of Markov Decision Process models. For a more comprehensive study of MDP models, and the techniques that can be used with them, read Chapters 11 and 12. This chapter discusses the performance of approximate LP policies, approximation error bounds, and provides an application to queueing networks. Another queueing network example can be found in Chapter 12. The chapter finishes with an efficient constraint sampling scheme.
- Published
- 2009
- Full Text
- View/download PDF
4. ADP: Goals, Opportunities and Principles
- Author
-
Warren B. Powell, Andrew G. Barto, Don Wunsch, and Jennie Si
- Subjects
Computer science, Management science, Field (Bourdieu), Hamilton–Jacobi–Bellman equation, Bibliography, Context (language use)
- Abstract
This chapter contains sections titled: Goals of This Book; Funding Issues, Opportunities and the Larger Context; Unifying Mathematical Principles and Roadmap of the Field; Bibliography.
- Published
- 2009
- Full Text
- View/download PDF
5. Reinforcement Learning in Large, High-Dimensional State Spaces
- Author
-
Andrew G. Barto, Jennie Si, Warren B. Powell, and Don Wunsch
- Subjects
Mathematical optimization, Linear programming, Rate of convergence, Boundary (topology), Reinforcement learning, Automatic test pattern generation, Temporal difference learning, Curse of dimensionality, Dual (category theory), Mathematics
- Abstract
The previous chapter addresses the "curse of dimensionality" by treating ADP as the dual of the linear programming problem and introduces the method known as approximate linear programming. This chapter presents another method for dealing with the "curse of dimensionality," the policy gradient reinforcement learning framework. The Action Transition Policy Gradient (ATPG) algorithm presented here estimates a gradient in the policy space that increases reward. Following a brief motivation, the authors present their algorithm in detail and discuss its properties. Finally, detailed experimental results are presented to show the types of problems that the algorithm can be applied to and what type of performance can be expected. Another algorithm, Boundary Localized Reinforcement Learning, is also discussed in this chapter. This is a mode switching controller that can be used to increase the rate of convergence.
- Published
- 2009
- Full Text
- View/download PDF
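The policy-gradient idea summarized in the abstract above can be illustrated with a minimal sketch. This is REINFORCE on a two-armed bandit, not the chapter's ATPG algorithm; all parameters and reward values here are invented for illustration.

```python
import numpy as np

# Minimal policy-gradient sketch: estimate a gradient in policy space
# that increases reward (REINFORCE on a two-armed bandit).
# NOT the ATPG algorithm from the chapter; values are illustrative only.

rng = np.random.default_rng(0)
theta = np.zeros(2)                  # softmax policy parameters, one per action
true_reward = np.array([1.0, 0.2])   # arm 0 pays more on average

for t in range(2000):
    p = np.exp(theta) / np.exp(theta).sum()   # softmax action probabilities
    a = rng.choice(2, p=p)                    # sample an action from the policy
    r = true_reward[a] + rng.normal(0, 0.1)   # noisy reward
    grad = -p
    grad[a] += 1.0                            # grad of log pi(a) for a softmax policy
    theta += 0.1 * r * grad                   # stochastic gradient ascent on reward

p = np.exp(theta) / np.exp(theta).sum()
print(p)  # probability mass concentrates on the higher-reward arm
```

Following the reward gradient shifts probability mass toward the better arm without ever building a value function, which is the essential contrast with the critic-based chapters.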
6. Robust Reinforcement Learning Using Integral-Quadratic Constraints
- Author
-
Jennie Si, Don Wunsch, Andrew G. Barto, and Warren B. Powell
- Subjects
Computer science, Reinforcement learning, Artificial intelligence, Robust control, Machine learning
- Published
- 2009
- Full Text
- View/download PDF
7. Reinforcement Learning and Its Relationship to Supervised Learning
- Author
-
Don Wunsch, Andrew G. Barto, Jennie Si, and Warren B. Powell
- Subjects
Learning classifier system, Computer science, Active learning (machine learning), Algorithmic learning theory, Online machine learning, Semi-supervised learning, Machine learning, Robot learning, Reinforcement learning, Unsupervised learning, Artificial intelligence
- Abstract
This chapter presents some key concepts of machine learning and approximate dynamic programming, and the relationships between them. Discussion and comparisons are made based on various aspects of the two fields, such as training information, behavioral variety, problem conversion, and applicable tasks. Many real-world examples illustrate the important distinctions being made. The primary focus is on the concepts and strategies of machine learning rather than algorithmic details, providing a high-level perspective on both fields.
- Published
- 2009
- Full Text
- View/download PDF
8. Multiobjective Control Problems by Reinforcement Learning
- Author
-
Jennie Si, Warren B. Powell, Don Wunsch, and Andrew G. Barto
- Subjects
Dynamic programming, Mathematical optimization, Computer science, Convergence (routing), Reinforcement learning, Fuzzy control system, Control (linguistics), Multi-objective optimization
- Abstract
Chapter 11 used hierarchical methods to solve multi-objective tasks. This chapter takes a different approach, using fuzzy control techniques. The mathematical background of multi-objective control and optimization is provided and a framework for an ADP algorithm with vector-valued rewards is introduced. Theoretical analyses are given to show certain convergence properties. A detailed algorithm implementation is presented, along with a cart-pole example.
- Published
- 2009
- Full Text
- View/download PDF
9. Control, Optimization, Security, and Self-Healing of Benchmark Power Systems - The views expressed here are those of the authors, Momoh (on leave from Howard University) and Zivi, and not the official views of NSF and the U.S. Naval Academy, USA
- Author
-
Andrew G. Barto, Don Wunsch, Jennie Si, and Warren B. Powell
- Subjects
Electric power system, Operations research, Quality of service, Political science, Reliability (computer networking), Benchmark (computing), Stability (learning theory), Reconfigurability, Electric power, Industrial engineering, Field (computer science)
- Abstract
This chapter presents several challenging benchmark problems from the field of power systems. The first benchmark is the IEEE 118 Bus commercial terrestrial Electrical Power System (EPS). The second benchmark represents a finite inertia hybrid ac/dc shipboard Integrated Power System (IPS). The analytic utility and Navy benchmark models and their respective simulations have been experimentally validated and have been used to determine system reliability, reconfigurability, stability, and security. The challenge is to provide novel control and optimization methods and tools to improve the quality of service despite natural and hostile disruptions under uncertain operating conditions. Several smaller problems are also presented that demonstrate different aspects of the challenges of power system control. The purpose of this chapter is the formulation of problems to which ADP methods could be applied; the emphasis is therefore on detailed problem description and simulation rather than on any particular solution.
- Published
- 2009
- Full Text
- View/download PDF
10. Supervised Actor-Critic Reinforcement Learning
- Author
-
Warren B. Powell, Andrew G. Barto, Jennie Si, and Don Wunsch
- Subjects
Structure (mathematical logic), Supervisor, Learning classifier system, Computer science, Intermittent control, Supervised learning, Machine learning, Unsupervised learning, Reinforcement learning, Artificial intelligence, Robotic arm
- Abstract
Chapter 7 introduced policy gradients as a way to improve on stochastic search of the policy space when learning. This chapter presents supervised actor-critic reinforcement learning as another method for improving the effectiveness of learning. With this approach, a supervisor adds structure to a learning problem and supervised learning makes that structure part of an actor-critic framework for reinforcement learning. Theoretical background and a detailed algorithm description are provided, along with several examples that contain enough detail to make them easy to understand and possible to duplicate. These examples also illustrate the use of two kinds of supervisors: a feedback controller that is easily designed yet suboptimal, and a human operator providing intermittent control of a simulated robotic arm.
- Published
- 2009
- Full Text
- View/download PDF
11. Model-Based Adaptive Critic Designs
- Author
-
Jennie Si, Andrew G. Barto, Warren B. Powell, and Don Wunsch
- Subjects
Heuristic dynamic programming, Perspective (geometry), Adaptive critics, Control engineering, Artificial intelligence, Common framework, DUAL (cognitive architecture), Pseudocode, Implementation, Mathematics
- Abstract
This chapter provides an overview of model-based adaptive critic designs, including background, general algorithms, implementations, and comparisons. The authors begin by introducing the mathematical background of model-based adaptive critic designs. Various ADP designs such as Heuristic Dynamic Programming (HDP), Dual HDP (DHP), Globalized DHP (GDHP), and Action-Dependent (AD) designs are examined from both a mathematical and implementation standpoint and put into perspective. Pseudocode is provided for many aspects of the algorithms. The chapter concludes with applications and examples. For another overview perspective that focuses more on implementation issues, read Chapter 4: Guidance in the Use of Adaptive Critics for Control. Chapter 15 contains a comparison of DHP with back-propagation through time, building a common framework for comparing these methods.
- Published
- 2009
- Full Text
- View/download PDF
12. Adaptive Critic Based Neural Network for Control-Constrained Agile Missile
- Author
-
Warren B. Powell, Don Wunsch, Andrew G. Barto, and Jennie Si
- Subjects
Dynamic programming, Engineering, Air-to-air missile, Optimization problem, Missile, Artificial neural network, Control theory, Control variable, Control engineering, Envelope (motion)
- Abstract
This chapter uses the adaptive critic approach, which was introduced in Chapters 3 and 4, to steer an agile missile with bounds on the angle of attack (the control variable) from various initial Mach numbers to a given final Mach number in minimum time while completely reversing its flight path angle. While a typical adaptive critic consists of a critic and controller, the agile missile problem needs chunking in terms of the independent control variable and, therefore, cascades of critics and controllers. Detailed derivations of equations and conditions on the constraint boundary are provided. For numerical experiments, the authors consider vertical plane scenarios. Numerical results demonstrate some attractive features of the adaptive critic approach and show that this formulation works very well in guiding the missile to its final conditions for this state-constrained optimization problem from an envelope of initial conditions.
- Published
- 2009
- Full Text
- View/download PDF
13. Hierarchical Decision Making
- Author
-
Andrew G. Barto, Don Wunsch, Jennie Si, and Warren B. Powell
- Subjects
Computer science, Control (management), Partially observable Markov decision process, Machine learning, Field (computer science), Task (project management), Reinforcement learning, Artificial intelligence, Markov decision process, Decision process
- Abstract
As the field of reinforcement learning has advanced, interest in solving realistic control problems has increased. However, Markov Decision Process (MDP) models have not proven sufficient to the task. This has led to increased use of Semi-Markov Decision Process models and the development of Hierarchical Reinforcement Learning (HRL). This chapter is an overview of HRL beginning with a discussion of the problems with the standard MDP models, then presenting the theory behind HRL, and finishing with some actual HRL algorithms that have been proposed. To see some examples of how hierarchical methods perform, see Chapter 11.
- Published
- 2009
- Full Text
- View/download PDF
14. Robust Reinforcement Learning for Heating, Ventilation, and Air Conditioning Control of Buildings
- Author
-
Jennie Si, Warren B. Powell, Don Wunsch, and Andrew G. Barto
- Subjects
Engineering, Air conditioning, Control theory, HVAC, Control (management), Water cooling, Reinforcement learning, Control engineering, The Internet, Robust control, Simulation
- Abstract
This chapter is a case study, implementing the technique presented in Chapter 13. A detailed problem formulation is presented for a heating and cooling system and a step-by-step solution is discussed. A combined PI and reinforcement learning controller is designed within a robust control framework, and detailed simulation results are presented. An internet link points to a website with further information on the experiments.
- Published
- 2009
- Full Text
- View/download PDF
15. Hierarchical Approaches to Concurrency, Multiagency, and Partial Observability
- Author
-
Jennie Si, Andrew G. Barto, Don Wunsch, and Warren B. Powell
- Subjects
Structure (mathematical logic), Theoretical computer science, Computer science, Concurrency, Process (computing), Probabilistic logic, Statistical model, Machine learning, Hierarchical database model, Task (computing), Observability, Artificial intelligence
- Abstract
In this chapter the authors summarize their research in hierarchical probabilistic models for decision making involving concurrent action, multiagent coordination, and hidden state estimation in stochastic environments. A hierarchical model for learning concurrent plans is first described for observable single agent domains, which combines compact state representations with temporal process abstractions to determine how to parallelize multiple threads of activity. A hierarchical model for multiagent coordination is then presented, where primitive joint actions and joint states are hidden. Here, high-level coordination is learned by exploiting overall task structure, which greatly speeds up convergence by abstracting from low-level steps that do not need to be synchronized. Finally, a hierarchical framework for hidden state estimation and action is presented, based on multi-resolution statistical modeling of the past history of observations and actions.
- Published
- 2009
- Full Text
- View/download PDF
16. Near-Optimal Control Through Reinforcement Learning and Hybridization
- Author
-
Andrew G. Barto, Jennie Si, Warren B. Powell, and Don Wunsch
- Subjects
Mathematical optimization, Line search, Computer science, Control theory, Convergence (routing), State space, Reinforcement learning, Function (mathematics), Additive model, Fuzzy logic
- Abstract
This chapter focuses on learning to act in a near-optimal manner through reinforcement learning for problems that either have no model or whose model is very complex. The emphasis here is on continuous action space (CAS) methods. Monte-Carlo approaches are employed to estimate function values in an iterative, incremental procedure. Derivative-free line search methods are used to find a near-optimal action in the continuous action space for a discrete subset of the state space. This near-optimal policy is then extended to the entire continuous state space using a fuzzy additive model. To compensate for approximation errors, a modified procedure for perturbing the generated control policy is developed. Convergence results, under moderate assumptions and stopping criteria, are established. References to successful applications of the controller are provided.
- Published
- 2009
- Full Text
- View/download PDF
17. Backpropagation Through Time and Derivative Adaptive Critics - A Common Framework for Comparison - Portions of this chapter were previously published in [4, 7, 9, 12-14, 23]
- Author
-
Warren B. Powell, Jennie Si, Don Wunsch, and Andrew G. Barto
- Subjects
Recurrent neural network, Derivative (finance), Adaptive critics, Computer science, Heuristic programming, Backpropagation through time, Common framework, Artificial intelligence, DUAL (cognitive architecture), Pseudocode
- Abstract
This chapter compares and contrasts derivative adaptive critics (DAC), such as dual heuristic programming (DHP), which was first introduced in Chapter 1 and also discussed in Chapter 3, with back-propagation through time (BPTT). A common framework is built and it is shown that both are techniques for determining the derivatives for training parameters in recurrent neural networks. This chapter goes into sufficient mathematical detail that the reader can understand the theoretical relationship between the two techniques. The author presents a hybrid technique that combines elements of both BPTT and DAC and provides detailed pseudocode. Computational issues and classes of challenging problems are discussed.
- Published
- 2009
- Full Text
- View/download PDF
18. Improved Temporal Difference Methods with Linear Function Approximation
- Author
-
Jennie Si, Don Wunsch, Andrew G. Barto, and Warren B. Powell
- Subjects
Dynamic programming, Mathematical optimization, Linear function (calculus), Function approximation, Bellman equation, Convergence (routing), Applied mathematics, Context (language use), Temporal difference learning, Least squares, Mathematics
- Abstract
We consider temporal difference algorithms within the context of infinite-horizon finite-state dynamic programming problems with discounted cost, and linear cost function approximation. We show, under standard assumptions, that a least squares-based temporal difference method, proposed by Nedic and Bertsekas (NeB03), converges with a stepsize equal to 1. To our knowledge, this is the first iterative temporal difference method that converges without requiring a diminishing stepsize. We discuss the connections of the method with Sutton's TD(λ) and with various versions of least squares-based value iteration, and we show via analysis and experiment that the method is substantially and often dramatically faster than TD(λ), as well as simpler and more reliable. We also discuss the relation of our method with the LSTD method of Boyan (Boy02), and Bradtke and Barto (BrB96).
- Published
- 2009
- Full Text
- View/download PDF
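The setting described in the abstract above can be sketched with the simplest member of the family: plain TD(0) with linear value-function approximation and a diminishing stepsize (the baseline that the chapter's least-squares method improves on). The Markov chain, rewards, and stepsize schedule below are invented for illustration.

```python
import numpy as np

# Illustrative sketch: TD(0) with linear value-function approximation
# on a small 3-state Markov chain. This is the classical diminishing-stepsize
# baseline, NOT the least-squares method of the chapter; all values invented.

rng = np.random.default_rng(0)
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])   # transition probabilities
r = np.array([1.0, 0.0, -1.0])    # expected one-step reward per state
gamma = 0.9                       # discount factor
phi = np.eye(3)                   # one feature vector per state (tabular features)

w = np.zeros(3)                   # linear weights: V(s) ~ phi[s] @ w
s = 0
for t in range(20000):
    s_next = rng.choice(3, p=P[s])
    alpha = 10.0 / (100.0 + t)    # diminishing stepsize
    # temporal-difference error for the observed transition
    delta = r[s] + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * delta * phi[s]   # move weights along the TD direction
    s = s_next

# With tabular features, w approaches the true value function
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)
print(w, V_true)
```

The least-squares variants discussed in the chapter replace the single-sample update with an accumulated least-squares problem, which is what lets them use a constant stepsize of 1.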
19. Clustering
- Author
-
Rui Xu and Don Wunsch
- Subjects
- Cluster analysis
- Abstract
This is the first book to take a truly comprehensive look at clustering. It begins with an introduction to cluster analysis and goes on to explore: proximity measures; hierarchical clustering; partition clustering; neural network-based clustering; kernel-based clustering; sequential data clustering; large-scale data clustering; data visualization and high-dimensional data clustering; and cluster validation. The authors assume no previous background in clustering and their generous inclusion of examples and references help make the subject matter comprehensible for readers of varying levels and backgrounds.
- Published
- 2009
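One of the partitional methods surveyed in the book can be sketched in a few lines. This is a minimal k-means implementation on synthetic data; the data, seeds, and parameters are invented for illustration and this is not code from the book.

```python
import numpy as np

# Minimal k-means sketch (a partitional clustering method of the kind
# surveyed in the book). All data and parameters are illustrative only.

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # initialize centers at k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: each center becomes the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# two well-separated Gaussian blobs around 0 and 5
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(5.0, 0.3, (50, 2))])
centers, labels = kmeans(X, k=2)
print(np.sort(centers[:, 0]))  # one center near each blob
```

The alternating assign/update structure is the template that many of the book's other partitional and kernel-based methods refine, for example by changing the distance measure or the center update.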
20. [Untitled]
- Author
-
Don Wunsch
- Subjects
Cognitive science, Artificial neural network, Artificial intelligence, Computer science, Cognitive Neuroscience
- Published
- 1992
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library