20 results on '"Don Wunsch"'
Search Results
2. Guidance in the Use of Adaptive Critics for Control
- Author
-
Jennie Si, Warren B. Powell, Andrew G. Barto, and Don Wunsch
- Subjects
Dynamic programming, Adaptive critics, Management science, Computer science, Control (management), Reinforcement learning, Artificial intelligence
- Abstract
This chapter, along with Chapter 3, provides an overview of several ADP design techniques. While Chapter 3 deals more with the theoretical foundations, this chapter (Chapter 4) is devoted more to practical issues such as problem formulation and utility functions. The authors discuss issues associated with designing and training adaptive critics using the design techniques introduced in Chapter 3.
- Published
- 2009
- Full Text
- View/download PDF
3. The Linear Programming Approach to Approximate Dynamic Programming
- Author
-
Warren B. Powell, Don Wunsch, Andrew G. Barto, and Jennie Si
- Subjects
Dynamic programming, Constraint (information theory), Queueing theory, Mathematical optimization, Linear programming, Computer science, Approximation error, Markov decision process, Curse of dimensionality, Dual (category theory)
- Abstract
This chapter addresses the issue of the "curse of dimensionality" by treating ADP as the dual of the linear programming problem and introduces the concept of approximate linear programming (ALP). It provides a brief introduction to the use of Markov Decision Process models. For a more comprehensive study of MDP models, and the techniques that can be used with them, read Chapters 11 and 12. This chapter discusses the performance of approximate LP policies, approximation error bounds, and provides an application to queueing networks. Another queueing network example can be found in Chapter 12. The chapter finishes with an efficient constraint sampling scheme.
- Published
- 2009
- Full Text
- View/download PDF
4. ADP: Goals, Opportunities and Principles
- Author
-
Warren B. Powell, Andrew G. Barto, Don Wunsch, and Jennie Si
- Subjects
Computer science, Management science, Field (Bourdieu), Hamilton–Jacobi–Bellman equation, Bibliography, Context (language use)
- Abstract
This chapter contains sections titled: Goals of This Book; Funding Issues, Opportunities and the Larger Context; Unifying Mathematical Principles and Roadmap of the Field; Bibliography.
- Published
- 2009
- Full Text
- View/download PDF
5. Reinforcement Learning in Large, High-Dimensional State Spaces
- Author
-
Andrew G. Barto, Jennie Si, Warren B. Powell, and Don Wunsch
- Subjects
Mathematical optimization, Linear programming, Rate of convergence, Boundary (topology), Reinforcement learning, Automatic test pattern generation, Temporal difference learning, Curse of dimensionality, Dual (category theory), Mathematics
- Abstract
The previous chapter addresses the "curse of dimensionality" by treating ADP as the dual of the linear programming problem and introduces the method known as approximate linear programming. This chapter presents another method for dealing with the "curse of dimensionality," the policy gradient reinforcement learning framework. The Action Transition Policy Gradient (ATPG) algorithm presented here estimates a gradient in the policy space that increases reward. Following a brief motivation, the authors present their algorithm in detail and discuss its properties. Finally, detailed experimental results are presented to show the types of problems that the algorithm can be applied to and what type of performance can be expected. Another algorithm, Boundary Localized Reinforcement Learning, is also discussed in this chapter. This is a mode switching controller that can be used to increase the rate of convergence.
- Published
- 2009
- Full Text
- View/download PDF
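The policy-gradient idea summarized in the abstract above can be illustrated with a minimal sketch. This is REINFORCE on a two-armed bandit, not the chapter's ATPG algorithm; all parameters and reward values here are invented for illustration.

```python
import numpy as np

# Minimal policy-gradient sketch: estimate a gradient in policy space
# that increases reward (REINFORCE on a two-armed bandit).
# NOT the ATPG algorithm from the chapter; values are illustrative only.

rng = np.random.default_rng(0)
theta = np.zeros(2)                  # softmax policy parameters, one per action
true_reward = np.array([1.0, 0.2])   # arm 0 pays more on average

for t in range(2000):
    p = np.exp(theta) / np.exp(theta).sum()   # softmax action probabilities
    a = rng.choice(2, p=p)                    # sample an action from the policy
    r = true_reward[a] + rng.normal(0, 0.1)   # noisy reward
    grad = -p
    grad[a] += 1.0                            # grad of log pi(a) for a softmax policy
    theta += 0.1 * r * grad                   # stochastic gradient ascent on reward

p = np.exp(theta) / np.exp(theta).sum()
print(p)  # probability mass concentrates on the higher-reward arm
```

Following the reward gradient shifts probability mass toward the better arm without ever building a value function, which is the essential contrast with the critic-based chapters.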
6. Robust Reinforcement Learning Using Integral-Quadratic Constraints
- Author
-
Jennie Si, Don Wunsch, Andrew G. Barto, and Warren B. Powell
- Subjects
Computer science, Reinforcement learning, Artificial intelligence, Robust control, Machine learning
- Published
- 2009
- Full Text
- View/download PDF
7. Reinforcement Learning and Its Relationship to Supervised Learning
- Author
-
Don Wunsch, Andrew G. Barto, Jennie Si, and Warren B. Powell
- Subjects
Learning classifier system, Computer science, Active learning (machine learning), Algorithmic learning theory, Online machine learning, Semi-supervised learning, Machine learning, Robot learning, Reinforcement learning, Unsupervised learning, Artificial intelligence
- Abstract
This chapter presents some key concepts of machine learning and approximate dynamic programming, and the relationships between them. Discussion and comparisons are made based on various aspects of the two fields, such as training information, behavioral variety, problem conversion, and applicable tasks. Many real-world examples illustrate the important distinctions being made. The primary focus is on the concepts and strategies of machine learning rather than algorithmic details, providing a high-level perspective on both fields.
- Published
- 2009
- Full Text
- View/download PDF
8. Multiobjective Control Problems by Reinforcement Learning
- Author
-
Jennie Si, Warren B. Powell, Don Wunsch, and Andrew G. Barto
- Subjects
Dynamic programming, Mathematical optimization, Computer science, Convergence (routing), Reinforcement learning, Fuzzy control system, Control (linguistics), Multi-objective optimization
- Abstract
Chapter 11 used hierarchical methods to solve multi-objective tasks. This chapter takes a different approach, using fuzzy control techniques. The mathematical background of multi-objective control and optimization is provided and a framework for an ADP algorithm with vector-valued rewards is introduced. Theoretical analyses are given to show certain convergence properties. A detailed algorithm implementation is presented, along with a cart-pole example.
- Published
- 2009
- Full Text
- View/download PDF
9. Control, Optimization, Security, and Self-Healing of Benchmark Power Systems - The views expressed here are those of the authors, Momoh (on leave from Howard University) and Zivi, and not the official views of NSF and the U.S. Naval Academy, USA
- Author
-
Andrew G. Barto, Don Wunsch, Jennie Si, and Warren B. Powell
- Subjects
Electric power system, Operations research, Quality of service, Political science, Reliability (computer networking), Benchmark (computing), Stability (learning theory), Reconfigurability, Electric power, Industrial engineering, Field (computer science)
- Abstract
This chapter presents several challenging benchmark problems from the field of power systems. The first benchmark is the IEEE 118 Bus commercial terrestrial Electrical Power System (EPS). The second benchmark represents a finite inertia hybrid ac/dc shipboard Integrated Power System (IPS). The analytic utility and Navy benchmark models and their respective simulations have been experimentally validated and have been used to determine system reliability, reconfigurability, stability, and security. The challenge is to provide novel control and optimization methods and tools to improve the quality of service despite natural and hostile disruptions under uncertain operating conditions. Several smaller problems are also presented that demonstrate different aspects of the challenges of power system control. The purpose of this chapter is the formulation of problems to which ADP methods could be applied; the emphasis is therefore on detailed problem description and simulation rather than on any particular solution.
- Published
- 2009
- Full Text
- View/download PDF
10. Supervised Actor-Critic Reinforcement Learning
- Author
-
Warren B. Powell, Andrew G. Barto, Jennie Si, and Don Wunsch
- Subjects
Structure (mathematical logic), Supervisor, Learning classifier system, Computer science, Intermittent control, Supervised learning, Machine learning, Unsupervised learning, Reinforcement learning, Artificial intelligence, Robotic arm
- Abstract
Chapter 7 introduced policy gradients as a way to improve on stochastic search of the policy space when learning. This chapter presents supervised actor-critic reinforcement learning as another method for improving the effectiveness of learning. With this approach, a supervisor adds structure to a learning problem and supervised learning makes that structure part of an actor-critic framework for reinforcement learning. Theoretical background and a detailed algorithm description are provided, along with several examples that contain enough detail to make them easy to understand and possible to duplicate. These examples also illustrate the use of two kinds of supervisors: a feedback controller that is easily designed yet suboptimal, and a human operator providing intermittent control of a simulated robotic arm.
- Published
- 2009
- Full Text
- View/download PDF
11. Model-Based Adaptive Critic Designs
- Author
-
Jennie Si, Andrew G. Barto, Warren B. Powell, and Don Wunsch
- Subjects
Heuristic dynamic programming, Perspective (geometry), Adaptive critics, Control engineering, Artificial intelligence, Common framework, DUAL (cognitive architecture), Pseudocode, Implementation, Mathematics
- Abstract
This chapter provides an overview of model-based adaptive critic designs, including background, general algorithms, implementations, and comparisons. The authors begin by introducing the mathematical background of model-based adaptive critic designs. Various ADP designs such as Heuristic Dynamic Programming (HDP), Dual HDP (DHP), Globalized DHP (GDHP), and Action-Dependent (AD) designs are examined from both a mathematical and implementation standpoint and put into perspective. Pseudocode is provided for many aspects of the algorithms. The chapter concludes with applications and examples. For another overview perspective that focuses more on implementation issues, read Chapter 4: Guidance in the Use of Adaptive Critics for Control. Chapter 15 contains a comparison of DHP with back-propagation through time, building a common framework for comparing these methods.
- Published
- 2009
- Full Text
- View/download PDF
12. Adaptive Critic Based Neural Network for Control-Constrained Agile Missile
- Author
-
Warren B. Powell, Don Wunsch, Andrew G. Barto, and Jennie Si
- Subjects
Dynamic programming, Engineering, Air-to-air missile, Optimization problem, Missile, Artificial neural network, Control theory, Control variable, Control engineering, Envelope (motion)
- Abstract
This chapter uses the adaptive critic approach, which was introduced in Chapters 3 and 4, to steer an agile missile with bounds on the angle of attack (the control variable) from various initial Mach numbers to a given final Mach number in minimum time while completely reversing its flight path angle. While a typical adaptive critic consists of a critic and controller, the agile missile problem needs chunking in terms of the independent control variable and, therefore, cascades of critics and controllers. Detailed derivations of equations and conditions on the constraint boundary are provided. For numerical experiments, the authors consider vertical plane scenarios. Numerical results demonstrate some attractive features of the adaptive critic approach and show that this formulation works very well in guiding the missile to its final conditions for this state-constrained optimization problem from an envelope of initial conditions.
- Published
- 2009
- Full Text
- View/download PDF
13. Hierarchical Decision Making
- Author
-
Andrew G. Barto, Don Wunsch, Jennie Si, and Warren B. Powell
- Subjects
Computer science, Control (management), Partially observable Markov decision process, Machine learning, Field (computer science), Task (project management), Reinforcement learning, Artificial intelligence, Markov decision process, Decision process
- Abstract
As the field of reinforcement learning has advanced, interest in solving realistic control problems has increased. However, Markov Decision Process (MDP) models have not proven sufficient to the task. This has led to increased use of Semi-Markov Decision Process models and the development of Hierarchical Reinforcement Learning (HRL). This chapter is an overview of HRL beginning with a discussion of the problems with the standard MDP models, then presenting the theory behind HRL, and finishing with some actual HRL algorithms that have been proposed. To see some examples of how hierarchical methods perform, see Chapter 11.
- Published
- 2009
- Full Text
- View/download PDF
14. Robust Reinforcement Learning for Heating, Ventilation, and Air Conditioning Control of Buildings
- Author
-
Jennie Si, Warren B. Powell, Don Wunsch, and Andrew G. Barto
- Subjects
Engineering, Air conditioning, Control theory, HVAC, Control (management), Water cooling, Reinforcement learning, Control engineering, The Internet, Robust control, Simulation
- Abstract
This chapter is a case study, implementing the technique presented in Chapter 13. A detailed problem formulation is presented for a heating and cooling system and a step-by-step solution is discussed. A combined PI and reinforcement learning controller is designed within a robust control framework, and detailed simulation results are presented. An internet link points to a website with further information on the experiments.
- Published
- 2009
- Full Text
- View/download PDF
15. Hierarchical Approaches to Concurrency, Multiagency, and Partial Observability
- Author
-
Jennie Si, Andrew G. Barto, Don Wunsch, and Warren B. Powell
- Subjects
Structure (mathematical logic), Theoretical computer science, Computer science, Concurrency, Process (computing), Probabilistic logic, Statistical model, Machine learning, Hierarchical database model, Task (computing), Observability, Artificial intelligence
- Abstract
In this chapter the authors summarize their research in hierarchical probabilistic models for decision making involving concurrent action, multiagent coordination, and hidden state estimation in stochastic environments. A hierarchical model for learning concurrent plans is first described for observable single agent domains, which combines compact state representations with temporal process abstractions to determine how to parallelize multiple threads of activity. A hierarchical model for multiagent coordination is then presented, where primitive joint actions and joint states are hidden. Here, high-level coordination is learned by exploiting overall task structure, which greatly speeds up convergence by abstracting from low-level steps that do not need to be synchronized. Finally, a hierarchical framework for hidden state estimation and action is presented, based on multi-resolution statistical modeling of the past history of observations and actions.
- Published
- 2009
- Full Text
- View/download PDF
16. Near-Optimal Control Through Reinforcement Learning and Hybridization
- Author
-
Andrew G. Barto, Jennie Si, Warren B. Powell, and Don Wunsch
- Subjects
Mathematical optimization, Line search, Computer science, Control theory, Convergence (routing), State space, Reinforcement learning, Function (mathematics), Additive model, Fuzzy logic
- Abstract
This chapter focuses on learning to act in a near-optimal manner through reinforcement learning for problems that either have no model or whose model is very complex. The emphasis here is on continuous action space (CAS) methods. Monte-Carlo approaches are employed to estimate function values in an iterative, incremental procedure. Derivative-free line search methods are used to find a near-optimal action in the continuous action space for a discrete subset of the state space. This near-optimal policy is then extended to the entire continuous state space using a fuzzy additive model. To compensate for approximation errors, a modified procedure for perturbing the generated control policy is developed. Convergence results, under moderate assumptions and stopping criteria, are established. References to successful applications of the controller are provided.
- Published
- 2009
- Full Text
- View/download PDF
17. Backpropagation Through Time and Derivative Adaptive Critics - A Common Framework for Comparison - Portions of this chapter were previously published in [4, 7, 9, 12-14, 23]
- Author
-
Warren B. Powell, Jennie Si, Don Wunsch, and Andrew G. Barto
- Subjects
Recurrent neural network, Derivative (finance), Adaptive critics, Computer science, Heuristic programming, Backpropagation through time, Common framework, Artificial intelligence, DUAL (cognitive architecture), Pseudocode
- Abstract
This chapter compares and contrasts derivative adaptive critics (DAC), such as dual heuristic programming (DHP), which was first introduced in Chapter 1 and also discussed in Chapter 3, with back-propagation through time (BPTT). A common framework is built and it is shown that both are techniques for determining the derivatives for training parameters in recurrent neural networks. This chapter goes into sufficient mathematical detail that the reader can understand the theoretical relationship between the two techniques. The author presents a hybrid technique that combines elements of both BPTT and DAC and provides detailed pseudocode. Computational issues and classes of challenging problems are discussed.
- Published
- 2009
- Full Text
- View/download PDF
18. Improved Temporal Difference Methods with Linear Function Approximation
- Author
-
Jennie Si, Don Wunsch, Andrew G. Barto, and Warren B. Powell
- Subjects
Dynamic programming, Mathematical optimization, Linear function (calculus), Function approximation, Bellman equation, Convergence (routing), Applied mathematics, Context (language use), Temporal difference learning, Least squares, Mathematics
- Abstract
We consider temporal difference algorithms within the context of infinite-horizon finite-state dynamic programming problems with discounted cost, and linear cost function approximation. We show, under standard assumptions, that a least squares-based temporal difference method, proposed by Nedic and Bertsekas (NeB03), converges with a stepsize equal to 1. To our knowledge, this is the first iterative temporal difference method that converges without requiring a diminishing stepsize. We discuss the connections of the method with Sutton's TD(λ) and with various versions of least squares-based value iteration, and we show via analysis and experiment that the method is substantially and often dramatically faster than TD(λ), as well as simpler and more reliable. We also discuss the relation of our method with the LSTD method of Boyan (Boy02), and Bradtke and Barto (BrB96).
- Published
- 2009
- Full Text
- View/download PDF
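The setting described in the abstract above can be sketched with the simplest member of the family: plain TD(0) with linear value-function approximation and a diminishing stepsize (the baseline that the chapter's least-squares method improves on). The Markov chain, rewards, and stepsize schedule below are invented for illustration.

```python
import numpy as np

# Illustrative sketch: TD(0) with linear value-function approximation
# on a small 3-state Markov chain. This is the classical diminishing-stepsize
# baseline, NOT the least-squares method of the chapter; all values invented.

rng = np.random.default_rng(0)
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])   # transition probabilities
r = np.array([1.0, 0.0, -1.0])    # expected one-step reward per state
gamma = 0.9                       # discount factor
phi = np.eye(3)                   # one feature vector per state (tabular features)

w = np.zeros(3)                   # linear weights: V(s) ~ phi[s] @ w
s = 0
for t in range(20000):
    s_next = rng.choice(3, p=P[s])
    alpha = 10.0 / (100.0 + t)    # diminishing stepsize
    # temporal-difference error for the observed transition
    delta = r[s] + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * delta * phi[s]   # move weights along the TD direction
    s = s_next

# With tabular features, w approaches the true value function
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)
print(w, V_true)
```

The least-squares variants discussed in the chapter replace the single-sample update with an accumulated least-squares problem, which is what lets them use a constant stepsize of 1.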
19. Clustering
- Author
-
Rui Xu and Don Wunsch
- Subjects
- Cluster analysis
- Abstract
This is the first book to take a truly comprehensive look at clustering. It begins with an introduction to cluster analysis and goes on to explore: proximity measures; hierarchical clustering; partition clustering; neural network-based clustering; kernel-based clustering; sequential data clustering; large-scale data clustering; data visualization and high-dimensional data clustering; and cluster validation. The authors assume no previous background in clustering and their generous inclusion of examples and references help make the subject matter comprehensible for readers of varying levels and backgrounds.
- Published
- 2009
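One of the partitional methods surveyed in the book can be sketched in a few lines. This is a minimal k-means implementation on synthetic data; the data, seeds, and parameters are invented for illustration and this is not code from the book.

```python
import numpy as np

# Minimal k-means sketch (a partitional clustering method of the kind
# surveyed in the book). All data and parameters are illustrative only.

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # initialize centers at k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: each center becomes the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# two well-separated Gaussian blobs around 0 and 5
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(5.0, 0.3, (50, 2))])
centers, labels = kmeans(X, k=2)
print(np.sort(centers[:, 0]))  # one center near each blob
```

The alternating assign/update structure is the template that many of the book's other partitional and kernel-based methods refine, for example by changing the distance measure or the center update.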
20. [Untitled]
- Author
-
Don Wunsch
- Subjects
Cognitive science, Artificial neural network, Artificial intelligence, Computer science, Cognitive Neuroscience
- Published
- 1992
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library