Descriptor: "Direct policy search" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Direct policy search"' showing total 72 results

1. Stochastic Kriging-Based Optimization Applied in Direct Policy Search for Decision Problems in Infrastructure Planning

Author: Maia, Cibelle Dias de Carvalho Dantas, Lopez, Rafael Holdorf, Chaari, Fakher, Series Editor, Gherardini, Francesco, Series Editor, Ivanov, Vitalii, Series Editor, Haddar, Mohamed, Series Editor, Cavas-Martínez, Francisco, Editorial Board Member, di Mare, Francesca, Editorial Board Member, Kwon, Young W., Editorial Board Member, Trojanowska, Justyna, Editorial Board Member, Xu, Jinyang, Editorial Board Member, and De Cursi, José Eduardo Souza, editor
Published: 2024
Full Text: View/download PDF

2. Connectivity of the Feasible and Sublevel Sets of Dynamic Output Feedback Control With Robustness Constraints

Author: Hu, Bin and Zheng, Yang
Subjects: Optimization landscape, sublevel set, direct policy search, H-infinity control, LQG control
Published: 2023

3. Incorporating learning into direct policy search for flood risk management.

Author: Wang, Jingya and Johnson, David R.
Subjects: FLOOD risk, FLOOD damage, OCEAN conditions (Weather), STORM surges, SEA level
Abstract: Direct policy search (DPS) is a method for identifying optimal policies (i.e., rules) for managing a system in response to changing conditions. In this article, we introduce a new adaptive way to incorporate learning into DPS. The standard DPS approach identifies "robust" policies by optimizing their average performance over a large ensemble of future states of the world (SOW). Our approach exploits information gained over time, updating prior beliefs about the kind of SOW being experienced. We first run the standard DPS approach multiple times, but with varying sets of weights applied to the SOWs when calculating average performance. Adaptive "metapolicies" then further improve performance by specifying how control of the system should switch between policies identified using different weight sets, depending on our updated beliefs about the relative likelihood of being in certain SOWs. We outline the general method and illustrate it using a case study of efficient dike heightening that simultaneously minimizes protection system costs and flood damage resulting from rising sea levels and storm surge. The solutions identified by our adaptive algorithm dominate the standard DPS on these two objectives, with an average marginal damage reduction of 35.1% for policies with similar costs; improvements are largest in SOWs with relatively lower sea level rise. We also evaluate how performance varies under different ways of implementing the algorithm, such as changing the frequency with which beliefs are updated. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. Valuing the Codesign of Streamflow Forecast and Reservoir Operation Models.

Author: Yang, Guang, Giuliani, Matteo, and Galelli, Stefano
Subjects: *MUNICIPAL water supply, *FORECASTING, *SEARCH algorithms, *STREAMFLOW, *LONG-range weather forecasting
Abstract: Seasonal streamflow forecasts are becoming widely used to improve water reservoir operations, especially in areas where climate teleconnections enable predictability on medium and long lead times. Most existing studies have focused on the assimilation of forecasts into operational decision models, an approach that typically banks on predeveloped forecasts to optimize water release decisions. However, this approach may overlook the potential synergies that stand in co-developing forecast and decision-making models. In other words, the opportunities that lie in coupling both forecast and operational decision models have not yet been explored. Here, we address this gap and contribute a novel approach building on the Evolutionary Multi-Objective Direct Policy Search algorithm to design forecast and decision-making models together. The proposed approach is benchmarked against operating policies not informed by any forecast, as well as by forecast-informed policies relying on predeveloped forecasts (data-driven and perfect). Numerical experiments are conducted on the Angat-Umiray water resources system, Philippines, which is operated primarily for ensuring municipal water supply to Metro Manila and irrigation supply to a large agricultural district. Our results show that the integrated design of forecast models and control policies provides a performance gain with respect to policies informed by predesigned forecasts. This result is particularly interesting because the skill of the integrated forecast models is lower than that of the predeveloped ones, thus suggesting that more accurate forecasts do not necessarily produce better water system operations. Overall, our analysis represents a step towards a deeper integration of streamflow forecast and reservoir operation models. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

5. Multi-objective optimal design of interbasin water transfers: The Tagus-Segura aqueduct (Spain)

Author: Carlotta Valerio, Matteo Giuliani, Andrea Castelletti, Alberto Garrido, and Lucia De Stefano
Subjects: Interbasin water transfer, Tagus-Segura aqueduct, Multi-objective evolutionary optimization, Direct policy search, Environmental flow, Physical geography, GB3-5030, Geology, QE1-996.5
Abstract: Study region: The Tagus-Segura aqueduct (TSA) is a large and strategic water transfer scheme in Spain that connects Entrepeñas and Buendía reservoirs in the Tagus river headwaters to the Segura river basin, a highly stressed Mediterranean area. Study focus: The operating rules of the TSA underwent several modifications over the years, and the debate about which are the optimal parameters to meet the interests of the parties involved is still open. We employed Evolutionary Multi-Objective Direct Policy Search to jointly optimize the re-operation of the headwaters dams and the water transfer policy with respect to four conflicting objectives: Tagus and Segura water demands, hydropower production and socioeconomic benefit of the population living on the shores of the headwaters reservoirs. We tested the optimization under the baseline and the 2027 scenario, which foresees an increased environmental flow (EF) in the Tagus river. New hydrological insights for the region: The proposed operating rule presents optimized control parameters, a higher degree of freedom and a transferred volume that cyclically varies according to the hydrological stage of the year. In the 2027 scenario, despite the increased EF, the deficit in the aqueduct shows a limited increase compared to the historical solution (+10%), while the storage deficit is strongly reduced (−73%). This benefits the population living on the reservoirs shores and also ensures more stability to the aqueduct functioning.
Published: 2023
Full Text: View/download PDF

6. Quantifying the trade-offs in re-operating dams for the environment in the Lower Volta River.

Author: Owusu, Afua, Salazar, Jazmin Zatarain, Mul, Marloes, van der Zaag, Pieter, and Slinger, Jill
Abstract: The construction of the Akosombo and Kpong dams in the Lower Volta River Basin in Ghana changed the downstream riverine ecosystem and affected the lives of downstream communities, particularly those who lost their traditional livelihoods. In contrast to the costs borne by those in the vicinity of the river, Ghana as a whole has enjoyed vast economic benefits from the affordable hydropower, irrigation schemes and lake tourism that developed after construction of the dams. Herein lies the challenge: there exists a trade-off between water for river ecosystems and related services on the one hand, and anthropogenic water demands such as hydropower or irrigation on the other. In this study, an Evolutionary Multi-Objective Direct Policy Search (EMODPS) is used to identify the multi-sectorial trade-offs that exist in the Lower Volta River Basin. Three environmental flows, previously determined for the Lower Volta, are incorporated separately as an environmental objective. The results highlight the dominance of hydropower production in the Lower Volta, but show that there is room for providing environmental flows under current climatic and water use conditions if the firm energy requirement from Akosombo Dam is reduced by 12% to 38%, depending on the environmental flow regime that is implemented. There is uncertainty in climate change effects on runoff in this region; however, multiple scenarios are investigated. It is found that climate change leading to increased annual inflows to the Akosombo Dam reduces the trade-off between hydropower and the environment, while climate change resulting in lower inflows provides the opportunity to strategically provide dry season environmental flows, that is, reduce flows sufficiently to meet low flow requirements for key ecosystem services such as the clam fishery. This study not only highlights the challenges in balancing anthropogenic water demands and environmental considerations in managing existing dams, but also identifies opportunities for compromise in the Lower Volta River. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

7. Evaluating the choice of radial basis functions in multiobjective optimal control applications

Author: Zatarain Salazar, J., Kwakkel, J.H., and Witvliet, Mark
Abstract: Evolutionary Multi-Objective Direct Policy Search (EMODPS) is a prominent framework for designing control policies in multi-purpose environmental systems, combining direct policy search with multi-objective evolutionary algorithms (MOEAs) to identify Pareto approximate control policies. While EMODPS is effective, the choice of functions within its global approximator networks remains underexplored, despite their potential to significantly influence both solution quality and MOEA performance. This study conducts a rigorous assessment of a suite of Radial Basis Functions (RBFs) as candidates for these networks. We critically evaluate their ability to map system states to control actions, and assess their influence on Pareto efficient control policies. We apply this analysis to two contrasting case studies: the Conowingo Reservoir System, which balances competing water demands including hydropower, environmental flows, urban supply, power plant cooling, and recreation; and The Shallow Lake Problem, where a city navigates the trade-off between environmental and economic objectives when releasing anthropogenic phosphorus. Our findings reveal that the choice of RBF functions substantially impacts model outcomes. In complex scenarios like multi-objective reservoir control, this choice is critical, while in simpler contexts, such as the Shallow Lake Problem, the influence is less pronounced, though distinctive differences emerge in the characteristics of the prescribed control strategies.
Published: 2024
Full Text: View/download PDF

8. Review of Research on Approximate Reinforcement Learning Algorithms.

Author: SI Yanna, PU Jiexin, and SUN Lifan
Subjects: REINFORCEMENT learning, MACHINE learning, LITERATURE reviews, ARTIFICIAL intelligence, ROBOT control systems
Abstract: Reinforcement learning (RL) is one of the most important techniques in artificial intelligence (AI). However, traditional tabular reinforcement learning struggles with control problems that have large-scale or continuous spaces. Approximate reinforcement learning draws on the idea of function approximation to parameterize the value function or strategy function, and obtains the optimal strategy indirectly through parameter optimization. It has been widely used in video games, the game of Go, robot control, and other areas, with remarkable performance. Accordingly, this paper reviews the research status and application progress of approximate reinforcement learning algorithms. First, the basic theory of approximate reinforcement learning is introduced. Then the classical algorithms of approximate reinforcement learning are classified and explained, along with some corresponding improvement methods. Finally, the research progress of approximate reinforcement learning in robotics is summarized, and some major open problems are outlined to provide a reference for future research. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

9. From Stream Flows to Cash Flows: Leveraging Evolutionary Multi‐Objective Direct Policy Search to Manage Hydrologic Financial Risks.

Author: Hamilton, Andrew L., Characklis, Gregory W., and Reed, Patrick M.
Subjects: FINANCIAL risk, STREAMFLOW, CASH flow, FINANCIAL risk management, HEDGING (Finance)
Abstract: Hydrologic variability can present severe financial challenges for organizations that rely on water for the provision of services, such as water utilities and hydropower producers. While recent decades have seen rapid growth in decision‐support innovations aimed at helping utilities manage hydrologic uncertainty for multiple objectives, support for managing the related financial risks remains limited. However, the mathematical similarities between multi‐objective reservoir control and financial risk management suggest that the two problems can be approached in a similar manner. This paper demonstrates the utility of Evolutionary Multi‐Objective Direct Policy Search for developing adaptive policies for managing the drought‐related financial risk faced by a hydropower producer. These policies dynamically balance a portfolio, consisting of snowpack‐based financial hedging contracts, cash reserves, and debt, based on evolving system conditions. Performance is quantified based on four conflicting objectives, representing the classic tradeoff between "risk" and "return" in addition to decision‐makers' unique preferences toward different risk management instruments. The dynamic policies identified here significantly outperform static management formulations that are more typically employed for financial risk applications in the water resources literature. Additionally, this paper combines visual analytics and information theoretic sensitivity analysis to improve understanding about how different candidate policies achieve their comparative advantages through differences in how they adapt to real‐time information. The methodology presented in this paper should be applicable to any organization subject to financial risk stemming from hydrology or other environmental variables (e.g., wind speed, insolation), including electric utilities, water utilities, agricultural producers, and renewable energy developers. 
Key Points: • Reservoir control and financial risk management share a common multi‐objective decision structure and can be optimized using similar methods. • Evolutionary Multi‐Objective Direct Policy Search is used to develop financial risk management policies for a hydropower producer. • Information theoretic sensitivity analysis and visual analytics are used to build intuition about how policies adapt to changing conditions. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

10. Policy Representation Learning for Multiobjective Reservoir Policy Design With Different Objective Dynamics.

Author: Zaniolo, Marta, Giuliani, Matteo, and Castelletti, Andrea
Subjects: WATER use, FEATURE selection, INFORMATION policy, INFORMATION design, DAM failures
Abstract: Most water reservoir operators make use of forecasts to inform their decisions and enhance water systems flexibility and resilience by anticipating hydrological extremes. Yet, although numerous candidate hydro‐meteorological variables and forecast horizons may be beneficial to operations, the best information set for a given problem is often not evident. Additionally, in multipurpose systems characterized by multiple demands with varying vulnerabilities and temporal scales, this information set might change according to the objective tradeoff. In this work, we contribute a novel method to learn the optimal policy representation (i.e., policy input set) by combining a feature selection routine with a multiobjective Direct Policy Search framework in order to retrieve the best policy input set online (i.e., while learning the policy) and dynamically with the objective trade‐off. The selected policy search routine is the Neuro‐Evolutionary Multi‐Objective Direct Policy Search (NEMODPS), which generates flexible policy shapes adaptive to online changes in the input set. This approach is demonstrated on the case study of Lake Como (Italy), where the operating objectives are highly heterogeneous in their dynamics (fast and slow) and vulnerabilities (wet and dry extremes). We show how varying objectives, and tradeoffs therein, benefit from a different policy representation, ultimately yielding remarkable results in terms of conflict mitigation between different users. More informed policies, moreover, show higher robustness when re‐evaluated across a suite of different hydrological conditions.
Key Points: • We introduce a novel method to define an optimal input set for a multipurpose dam operating policy that varies with the objective trade‐off. • Better informed policies are able to mitigate conflicts between water users and achieve system‐wide benefits. • The addition of information in policy design increases the policies' robustness toward extreme hydrological conditions. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

11. Adaptive mitigation strategies hedge against extreme climate futures.

Author: Marangoni, Giacomo, Lamontagne, Jonathan R., Quinn, Julianne D., Reed, Patrick M., and Keller, Klaus
Abstract: The United Nations Framework Convention on Climate Change agreed to “strengthen the global response to the threat of climate change, in the context of sustainable development and efforts to eradicate poverty” (UNFCCC 2015). Designing a global mitigation strategy to support this goal poses formidable challenges. For one, there are trade-offs between the economic costs and the environmental benefits of averting climate impacts. Furthermore, the coupled human-Earth systems are subject to deep and dynamic uncertainties. Previous economic analyses typically addressed either the former, introducing multiple objectives, or the latter, making mitigation actions responsive to new information. This paper aims at bridging these two separate strands of literature. We demonstrate how information feedback from observed global temperature changes can jointly improve the economic and environmental performance of mitigation strategies. We focus on strategies that maximize discounted expected utility while also minimizing warming above 2 °C, damage costs, and mitigation costs. Expanding on the Dynamic Integrated Climate-Economy (DICE) model and previous multi-objective efforts, we implement closed-loop control strategies, map the emerging trade-offs and quantify the value of the temperature information feedback under both well-characterized and deep climate uncertainties. Adaptive strategies strongly reduce high regrets, guarding against mitigation overspending for less sensitive climate futures, and excessive warming for more sensitive ones. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

12. Direct Policy Search Reinforcement Learning Based on Variational Bayesian Inference.

Author: Yamaguchi, Nobuhiko
Subjects: *REINFORCEMENT learning, *BAYESIAN analysis, *COMPUTER algorithms, *PARAMETER estimation, *GOVERNMENT policy
Abstract: Direct policy search is a promising reinforcement learning framework particularly for controlling continuous, high-dimensional systems. Peters et al. proposed reward-weighted regression (RWR) as a direct policy search. The RWR algorithm estimates the policy parameter based on the expectation-maximization (EM) algorithm and is therefore prone to overfitting. In this study, we focus on variational Bayesian inference to avoid overfitting and propose direct policy search reinforcement learning based on variational Bayesian inference (VBRL). The performance of the proposed VBRL is assessed in several experiments involving a mountain car and a ball batting task. These experiments demonstrate that VBRL yields a higher average return and outperforms the RWR. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

13. RL-Cache: Learning-Based Cache Admission for Content Delivery.

Author: Kirilin, Vadim, Sundarrajan, Aditya, Gorinsky, Sergey, and Sitaraman, Ramesh K.
Subjects: MONTE Carlo method, CONTENT delivery networks, ALGORITHMS, INTERNET content, INTERNET traffic, REINFORCEMENT learning
Abstract: Content delivery networks (CDNs) distribute much of the Internet content by caching and serving the objects requested by users. A major goal of a CDN is to maximize the hit rates of its caches, thereby enabling faster content downloads to the users. Content caching involves two components: an admission algorithm to decide whether to cache an object and an eviction algorithm to determine which object to evict from the cache when it is full. In this paper, we focus on cache admission and propose a novel algorithm called RL-Cache that uses model-free reinforcement learning (RL) to decide whether or not to admit a requested object into the CDN’s cache. Unlike prior approaches that use a small set of criteria for decision making, RL-Cache weights a large set of features that include the object size, recency, and frequency of access. We develop a publicly available implementation of RL-Cache and perform an evaluation using production traces for the image, video, and web traffic classes from Akamai’s CDN. The evaluation shows that RL-Cache improves the hit rate in comparison with the state of the art and imposes only a modest resource overhead on the CDN servers. Further, RL-Cache is robust enough that it can be trained in one location and executed on request traces of the same or different traffic classes in other locations of the same geographic region. The paper also reports extensive analyses of the RL-Cache sensitivity to its features and hyperparameter values. The analyses validate the design choices made and reveal interesting insights into the RL-Cache behavior. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

14. Performance of Implicit Stochastic Approaches to the Synthesis of Multireservoir Operating Rules.

Author: Guariso, Giorgio and Sangiorgio, Matteo
Subjects: *WATER supply, *WATER pressure, *WATER storage, *WATER power
Abstract: With increasing pressure on water resources availability and dependability and constraints due to environmental concerns, the traditional approaches for defining reservoir management rules are often inadequate. In particular, in multireservoir systems, when multiple input variables (e.g., the storage of other reservoirs in the system, water demand in different districts) must be taken into account, it is almost impossible to figure out which shape the operating rule(s) could have. For these reasons, neural network (NN) based rules have been increasingly adopted in the last decade. NN-based rules are well known as universal approximators that can help determine the most interesting input variables, their mutual relations, and how they contribute to the definition of the optimal releases. Two approaches to the identification of neural management rules are discussed in the paper. The first solves a deterministic open-loop (i.e., with known inflows) problem and then identifies neural closed-loop policies using the classical regression method, so that the rules approximate as much as possible the solution found in the first step. The second approach, direct policy search, assumes that the operating rule is represented by an NN, the parameters of which are optimized directly by solving the optimal closed-loop problem. This work applies the two approaches to the case of the downstream portion of the Nile River basin system, which contains some large reservoirs, and for which several years of synthetic streamflows are available. The comparison of the two approaches highlights intrinsic differences, showing the benefits and disadvantages of each. In the specific case of the Nile, the first approach performs better in terms of global agricultural deficit and hydropower production. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

15. Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage.

Author: Moazeni, Somayeh, Scott, Warren R., and Powell, Warren B.
Subjects: LEAST squares, ENERGY storage, DYNAMIC programming, INSTRUMENTAL variables (Statistics), OPERATIONS management
Abstract: This article studies least-squares approximate policy iteration (API) methods with parametrized value-function approximation. We study several variations of the policy evaluation phase, namely, Bellman error minimization, Bellman error minimization with instrumental variables, projected Bellman error minimization, and projected Bellman error minimization with instrumental variables. For a general discrete-time stochastic control problem, Bellman error minimization policy evaluation using instrumental variables is equivalent to both variants of the projected Bellman error minimization. An alternative to these API methods is direct policy search based on knowledge gradient. The practical performance of these three approximate dynamic programming methods, (i) least squares API with Bellman error minimization, (ii) least squares API with Bellman error minimization with instrumental variables, and (iii) direct policy search, are investigated in the context of an application in energy storage operations management. We create a library of test problems using real-world data and apply value iteration to find their optimal policies. These optimal benchmarks are then used to compare the developed approximate dynamic programming policies. Our analysis indicates that least-squares API with instrumental variables Bellman error minimization prominently outperforms least-squares API with Bellman error minimization. However, these approaches underperform our direct policy search implementation. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

16. Evaluating the choice of radial basis functions in multiobjective optimal control applications.

Author: Zatarain Salazar, Jazmin, Kwakkel, Jan H., and Witvliet, Mark
Subjects: *RADIAL basis functions, *EVOLUTIONARY algorithms, *WATER management, *WATER power
Abstract: Evolutionary Multi-Objective Direct Policy Search (EMODPS) is a prominent framework for designing control policies in multi-purpose environmental systems, combining direct policy search with multi-objective evolutionary algorithms (MOEAs) to identify Pareto approximate control policies. While EMODPS is effective, the choice of functions within its global approximator networks remains underexplored, despite their potential to significantly influence both solution quality and MOEA performance. This study conducts a rigorous assessment of a suite of Radial Basis Functions (RBFs) as candidates for these networks. We critically evaluate their ability to map system states to control actions, and assess their influence on Pareto efficient control policies. We apply this analysis to two contrasting case studies: the Conowingo Reservoir System, which balances competing water demands including hydropower, environmental flows, urban supply, power plant cooling, and recreation; and The Shallow Lake Problem, where a city navigates the trade-off between environmental and economic objectives when releasing anthropogenic phosphorus. Our findings reveal that the choice of RBF functions substantially impacts model outcomes. In complex scenarios like multi-objective reservoir control, this choice is critical, while in simpler contexts, such as the Shallow Lake Problem, the influence is less pronounced, though distinctive differences emerge in the characteristics of the prescribed control strategies. • RBF choice in EMODPS impacts tradeoffs and policies in multiobjective control. • Lake Problem: RBFs affect control policies, not objective values. • Concave RBFs excel in complex EMODPS, like Conowingo Reservoir. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

17. Incorporating learning into direct policy search for flood risk management.

Author: Wang J and Johnson DR
Abstract: Direct policy search (DPS) is a method for identifying optimal policies (i.e., rules) for managing a system in response to changing conditions. In this article, we introduce a new adaptive way to incorporate learning into DPS. The standard DPS approach identifies "robust" policies by optimizing their average performance over a large ensemble of future states of the world (SOW). Our approach exploits information gained over time, updating prior beliefs about the kind of SOW being experienced. We first run the standard DPS approach multiple times, but with varying sets of weights applied to the SOWs when calculating average performance. Adaptive "metapolicies" then further improve performance by specifying how control of the system should switch between policies identified using different weight sets, depending on our updated beliefs about the relative likelihood of being in certain SOWs. We outline the general method and illustrate it using a case study of efficient dike heightening that simultaneously minimizes protection system costs and flood damage resulting from rising sea levels and storm surge. The solutions identified by our adaptive algorithm dominate the standard DPS on these two objectives, with an average marginal damage reduction of 35.1% for policies with similar costs; improvements are largest in SOWs with relatively lower sea level rise. We also evaluate how performance varies under different ways of implementing the algorithm, such as changing the frequency with which beliefs are updated. (© 2023 The Authors. Risk Analysis published by Wiley Periodicals LLC on behalf of Society for Risk Analysis.)
Published: 2024
Full Text: View/download PDF

18. On the Value of ENSO State for Urban Water Supply System Operators: Opportunities, Trade‐Offs, and Challenges.

Author: Libisch‐Lehner, C. P., Nachtnebel, H. P., Nguyen, H. T. T., Taormina, R., and Galelli, S.
Subjects: EL Nino, WATER supply, URBAN geography
Abstract: The El Niño Southern Oscillation (ENSO) is a major driver of global hydro‐climatic variability, with well‐known effects on floods, droughts, and coupled human‐natural systems. Its impact on urban settlements depends on both level of exposure and preparedness, two factors that are responsible for severe water supply cuts affecting millions of people in developing countries, where urban water supply relies almost entirely on rainfall‐dependent sources. To understand whether information on the ENSO state could help mitigate the effects of droughts, we use Metro Manila's water supply system as an exemplifying case study, for which we design "traditional" and adaptive management policies. The former are based on information typically available to operators, such as reservoir storage; the latter complement such information with the Oceanic Niño Index (ONI)—an indicator used for monitoring El Niño and La Niña state. Results obtained by comparing the policy performance on a large set of stochastic streamflow and ONI replicates show that ENSO‐informed policies are more robust, meaning that they attain a minimum performance level across a broader set of replicates. We show that the primary cause of this behavior is the information on the ENSO state. To further quantify the value of the ONI, we then compare the performance of a representative ENSO‐informed policy and the system's current operating rules on the period 1968–2014. The comparison shows that the severe water supply restrictions caused by the existing management system could have been partially avoided through a sequence of smaller restrictions implemented at the onset of the main El Niño events. Key Points: • A computational framework to assess ENSO impact on urban water supply. • ENSO‐informed operating policies increase system's robustness. • Information on ENSO could reduce regrets in the present operations. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

19. Sparse Gradient-Based Direct Policy Search

Author: Sokolovska, Nataliya, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Huang, Tingwen, editor, Zeng, Zhigang, editor, Li, Chuandong, editor, and Leung, Chi Sing, editor
Published: 2012
Full Text: View/download PDF

20. Convergence to the Globally Optimal Controller

Author: Sanfelice Bazanella, Alexandre, Campestrini, Lucíola, and Eckhard, Diego
Published: 2012
Full Text: View/download PDF

21. Using direct policy search to identify robust strategies in adapting to uncertain sea-level rise and storm surge.

Author: Garner, Gregory G. and Keller, Klaus
Subjects: *SEA level, *STORM surges, *COASTAL ecology, *CLIMATE change, *INFRASTRUCTURE (Economics)
Abstract: Sea-level rise poses considerable risks to coastal communities, ecosystems, and infrastructure. Decision makers are faced with uncertain sea-level projections when designing a strategy for coastal adaptation. The traditional methods are often silent on tradeoffs as well as the effects of tail-area events and of potential future learning. Here we reformulate a simple sea-level rise adaptation model to address these concerns. We show that Direct Policy Search yields improved solution quality, with respect to Pareto-dominance in the objectives, over the traditional approach under uncertain sea-level rise projections and storm surge. Additionally, the new formulation produces high quality solutions with less computational demands than an intertemporal optimization approach. Our results illustrate the utility of multi-objective adaptive formulations for the example of coastal adaptation and point to wider-ranging application in climate change adaptation decision problems. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

22. Balancing exploration, uncertainty and computational demands in many objective reservoir optimization.

Author: Zatarain Salazar, Jazmin, Reed, Patrick M., Quinn, Julianne D., Giuliani, Matteo, and Castelletti, Andrea
Subjects: *WATERSHED management, *HEURISTIC, *COMPUTER algorithms, *WATER resources development, *STREAM measurements
Abstract: Reservoir operations are central to our ability to manage river basin systems serving conflicting multi-sectoral demands under increasingly uncertain futures. These challenges motivate the need for new solution strategies capable of effectively and efficiently discovering the multi-sectoral tradeoffs that are inherent to alternative reservoir operation policies. Evolutionary many-objective direct policy search (EMODPS) is gaining importance in this context due to its capability of addressing multiple objectives and its flexibility in incorporating multiple sources of uncertainties. This simulation-optimization framework has high potential for addressing the complexities of water resources management, and it can benefit from current advances in parallel computing and meta-heuristics. This study contributes a diagnostic assessment of state-of-the-art parallel strategies for the auto-adaptive Borg Multi Objective Evolutionary Algorithm (MOEA) to support EMODPS. Our analysis focuses on the Lower Susquehanna River Basin (LSRB) system where multiple sectoral demands from hydropower production, urban water supply, recreation and environmental flows need to be balanced. Using EMODPS with different parallel configurations of the Borg MOEA, we optimize operating policies over different size ensembles of synthetic streamflows and evaporation rates. As we increase the ensemble size, we increase the statistical fidelity of our objective function evaluations at the cost of higher computational demands. This study demonstrates how to overcome the mathematical and computational barriers associated with capturing uncertainties in stochastic multiobjective reservoir control optimization, where parallel algorithmic search serves to reduce the wall-clock time in discovering high quality representations of key operational tradeoffs. Our results show that emerging self-adaptive parallelization schemes exploiting cooperative search populations are crucial. 
Such strategies provide a promising new set of tools for effectively balancing exploration, uncertainty, and computational demands when using EMODPS. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

23. Exploring global approximators for multiobjective reservoir control

Author: Zatarain Salazar, J. (author), Kwakkel, J.H. (author), and Witvliet, Mark (author)
Abstract: Efficient multi-purpose reservoir control policies are crucial in the face of frequent and severe floods and droughts, and to balance water allocation across conflicting demands. Evolutionary Multi-Objective Direct Policy Search (EMODPS) is a popular approach to design control policies for multi-purpose reservoir systems. EMODPS, however, relies on experimental choices within the key components of the framework particularly when coupling multi-objective evolutionary optimization with nonlinear approximation networks. This study explores a suite of radial basis functions (RBFs) used to map the system's states to control actions in a flexible manner as time-varying, non-linear relationships. We provide a systematic assessment of different RBF functions to explore their suitability to obtain Pareto efficient control policies. We use the Susquehanna river basin case study in which competing water demands for hydropower, environment, urban water supply, atomic power plant cooling and recreation need to be met. Our findings suggest that the choice of RBF functions have a large impact on the model outcomes and the search behavior of the optimization algorithm., Policy Analysis
Published: 2022
Full Text: View/download PDF

24. Some properties of nonlinear Lanchester equations with an application in military.

Author: Kim, Donghyun, Moon, Hyungil, and Shin, Hayong
Subjects: *NETWORK-centric operations (Military science), *LANCHESTER automobiles, *NONLINEAR systems, *DECISION making, *APPLICATION software, *DIFFERENTIAL equations
Abstract: There is a large research literature on traditional direct fire combat modelling. Recently, network centric warfare (NCW) has become an active research topic, in which information plays a more important role than in traditional warfare. It is easily agreed that the use of information greatly affects combat results. However, it is not straightforward to measure the effect of the information, so decision making that involves the impact of information during combat is a non-trivial task. In this study, we propose a simple model for NCW, modified from the original Lanchester differential equation, which can be used as a basic model for analysing characteristics of NCW. We derive some useful properties of the model in a special case. In order to solve the optimal fire allocation decision under this model in the general case, we propose an algorithm based on reinforcement learning, followed by numerical examples. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

25. Direct policy search for robust multi-objective management of deeply uncertain socio-ecological tipping points.

Author: Quinn, Julianne D., Reed, Patrick M., and Keller, Klaus
Subjects: *ENVIRONMENTAL policy, *ECOSYSTEMS, *SOCIAL systems, *PHOSPHORUS, *POLLUTION control industry, *DECISION making
Abstract: Managing socio-ecological systems is a challenge wrought by competing societal objectives, deep uncertainties, and potentially irreversible tipping points. A classic, didactic example is the shallow lake problem in which a hypothetical town situated on a lake must develop pollution control strategies to maximize its economic benefits while minimizing the probability of the lake crossing a critical phosphorus (P) threshold, above which it irreversibly transitions into a eutrophic state. Here, we explore the use of direct policy search (DPS) to design robust pollution control rules for the town that account for deeply uncertain system characteristics and conflicting objectives. The closed loop control formulation of DPS improves the quality and robustness of key management tradeoffs, while dramatically reducing the computational complexity of solving the multi-objective pollution control problem relative to open loop control strategies. These insights suggest DPS is a promising tool for managing socio-ecological systems with deeply uncertain tipping points. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

26. Connectivity of the Feasible and Sublevel Sets of Dynamic Output Feedback Control with Robustness Constraints

Author: Bin Hu and Yang Zheng
Subjects: direct policy search, Control and Optimization, Control and Systems Engineering, Optimization and Control (math.OC), LQG control, H-infinity control, FOS: Mathematics, Optimization landscape, sublevel set, Mathematics - Optimization and Control
Abstract: This paper considers the optimization landscape of linear dynamic output feedback control with $\mathcal{H}_\infty$ robustness constraints. We consider the feasible set of all the stabilizing full-order dynamical controllers that satisfy an additional $\mathcal{H}_\infty$ robustness constraint. We show that this $\mathcal{H}_\infty$-constrained set has at most two path-connected components that are diffeomorphic under a mapping defined by a similarity transformation. Our proof technique utilizes a classical change of variables in $\mathcal{H}_\infty$ control to establish a surjective mapping from a set with a convex projection to the $\mathcal{H}_\infty$-constrained set. This proof idea can also be used to establish the same topological properties of strict sublevel sets of linear quadratic Gaussian (LQG) control and optimal $\mathcal{H}_\infty$ control. Our results bring positive news for gradient-based policy search on robust control problems., Comment: Submitted to L-CSS and CDC 2022
Published: 2022
Full Text: View/download PDF

27. Exploring global approximators for multiobjective reservoir control

Author: Jazmin Zatarain Salazar, Jan Kwakkel, and Mark Witvliet
Subjects: direct policy search, Control and Systems Engineering, global approximators, Optimal operation of water resources systems
Abstract: Efficient multi-purpose reservoir control policies are crucial in the face of frequent and severe floods and droughts, and to balance water allocation across conflicting demands. Evolutionary Multi-Objective Direct Policy Search (EMODPS) is a popular approach to design control policies for multi-purpose reservoir systems. EMODPS, however, relies on experimental choices within the key components of the framework particularly when coupling multi-objective evolutionary optimization with nonlinear approximation networks. This study explores a suite of radial basis functions (RBFs) used to map the system's states to control actions in a flexible manner as time-varying, non-linear relationships. We provide a systematic assessment of different RBF functions to explore their suitability to obtain Pareto efficient control policies. We use the Susquehanna river basin case study in which competing water demands for hydropower, environment, urban water supply, atomic power plant cooling and recreation need to be met. Our findings suggest that the choice of RBF functions have a large impact on the model outcomes and the search behavior of the optimization algorithm.
Published: 2022

28. Season-Dependent Hedging Policies for Reservoir Operation—A Comparison Study

Author: Nikhil Bhatia, Roshan Srivastav, and Kasthrirengan Srinivasan
Subjects: parameterization, simulation, optimization, direct policy search, hedging policy, shortage ratio, vulnerability, NSGA-II, Hydraulic engineering, TC1-978, Water supply for domestic and industrial purposes, TD201-500
Abstract: During periods of significant water shortage or when drought is impending, it is customary to implement some kind of water supply reduction measures with a view to prevent the occurrence of severe shortages (vulnerability) in the near future. In the case of operation of a water supply reservoir, this reduction of water supply is affected by hedging schemes or hedging policies. This research work aims to compare the popular hedging policies: (i) linear two-point hedging; (ii) modified two-point hedging; and, (iii) discrete hedging based on time-varying and constant hedging parameters. A parameterization-simulation-optimization (PSO) framework is employed for the selection of the parameters of the compromising hedging policies. The multi-objective evolutionary search-based technique (Non-dominated Sorting based Genetic Algorithm-II) was used to identify the Pareto-optimal front of hedging policies that seek to obtain the trade-off between shortage ratio and vulnerability. The case example used for illustration is the Hemavathy reservoir in Karnataka, India. It is observed that the Pareto-optimal front that was obtained from time-varying hedging policies show significant improvement in reservoir performance when compared to constant hedging policies. The variation in the monthly parameters of the time-variant hedging policies shows a strong correlation with monthly inflows and available water.
Published: 2018
Full Text: View/download PDF

29. A diagnostic assessment of evolutionary algorithms for multi-objective surface water reservoir control.

Author: Zatarain Salazar, Jazmin, Reed, Patrick M., Herman, Jonathan D., Giuliani, Matteo, and Castelletti, Andrea
Subjects: *EVOLUTIONARY algorithms, *RESERVOIRS, *CLIMATE change, *WATERSHEDS, *PARETO analysis
Abstract: Globally, the pressures of expanding populations, climate change, and increased energy demands are motivating significant investments in re-operationalizing existing reservoirs or designing operating policies for new ones. These challenges require an understanding of the tradeoffs that emerge across the complex suite of multi-sector demands in river basin systems. This study benchmarks our current capabilities to use Evolutionary Multi-Objective Direct Policy Search (EMODPS), a decision analytic framework in which reservoirs’ candidate operating policies are represented using parameterized global approximators (e.g., radial basis functions) then those parameterized functions are optimized using multi-objective evolutionary algorithms to discover the Pareto approximate operating policies. We contribute a comprehensive diagnostic assessment of modern MOEAs’ abilities to support EMODPS using the Conowingo reservoir in the Lower Susquehanna River Basin, Pennsylvania, USA. Our diagnostic results highlight that EMODPS can be very challenging for some modern MOEAs and that epsilon dominance, time-continuation, and auto-adaptive search are helpful for attaining high levels of performance. The ϵ-MOEA, the auto-adaptive Borg MOEA, and ϵ-NSGAII all yielded superior results for the six-objective Lower Susquehanna benchmarking test case. The top algorithms show low sensitivity to different MOEA parameterization choices and high algorithmic reliability in attaining consistent results for different random MOEA trials. Overall, EMODPS poses a promising method for discovering key reservoir management tradeoffs; however algorithmic choice remains a key concern for problems of increasing complexity. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

30. Curses, Tradeoffs, and Scalable Management: Advancing Evolutionary Multiobjective Direct Policy Search to Improve Water Reservoir Operations.

Author: Giuliani, Matteo, Castelletti, Andrea, Pianosi, Francesca, Mason, Emanuele, and Reed, Patrick M.
Subjects: *RESERVOIRS, *WATER distribution, *WATER-supply engineering, *WATER quality, *NATURAL resources management
Abstract: Optimal management policies for water reservoir operation are generally designed via stochastic dynamic programming (SDP). Yet, the adoption of SDP in complex real-world problems is challenged by the three curses of dimensionality, modeling, and multiple objectives. These three curses considerably limit SDP's practical application. Alternatively, this study focuses on the use of evolutionary multiobjective direct policy search (EMODPS), a simulation-based optimization approach that combines direct policy search, nonlinear approximating networks, and multiobjective evolutionary algorithms to design Pareto-approximate closed-loop operating policies for multipurpose water reservoirs. This analysis explores the technical and practical implications of using EMODPS through a careful diagnostic assessment of the effectiveness and reliability of the overall EMODPS solution design as well as of the resulting Pareto-approximate operating policies. The EMODPS approach is evaluated using the multipurpose Hoa Binh water reservoir in Vietnam, where water operators are seeking to balance the conflicting objectives of maximizing hydropower production and minimizing flood risks. A key choice in the EMODPS approach is the selection of alternative formulations for flexibly representing reservoir operating policies. This study distinguishes between the relative performance of two widely-used nonlinear approximating networks, namely artificial neural networks (ANNs) and radial basis functions (RBFs). The results show that RBF solutions are more effective than ANN ones in designing Pareto approximate policies for the Hoa Binh reservoir. Given the approximate nature of EMODPS, the diagnostic benchmarking uses SDP to evaluate the overall quality of the attained Pareto-approximate results. Although the Hoa Binh test case's relative simplicity should maximize the potential value of SDP, the results demonstrate that EMODPS successfully dominates the solutions derived via SDP. 
[ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

31. Distribution of waiting time for dynamic pickup and delivery problems.

Author: Vonolfen, Stefan and Affenzeller, Michael
Subjects: *DISTRIBUTION (Probability theory), *EXPRESS service (Delivery of goods), *PASSENGERS, *HEURISTIC algorithms, *SIMULATION methods & models
Abstract: Pickup and delivery problems have numerous applications in practice such as parcel delivery and passenger transportation. In the dynamic variant of the problem, not all information is available in advance but is revealed during the planning process. Thus, it is crucial to anticipate future events in order to generate high-quality solutions. Previous work has shown that the use of waiting strategies has the potential to save costs and maximize service quality. We adapt various waiting heuristics to the pickup and delivery problem with time windows. Previous research has shown, that specialized waiting heuristics utilizing anticipatory knowledge potentially outperform general heuristics. Direct policy search based on evolutionary computation and a simulation model is proposed as a methodology to automatically specialize waiting strategies to different problem characteristics. Based on the strengths of the previously introduced waiting strategies, we propose a novel waiting heuristic that can utilize historical request information based on an intensity measure which does not require an additional data preprocessing step. The performance of the waiting heuristics is evaluated on a single set of benchmark instances containing various instance classes that differ in terms of spatial and temporal properties. The diverse set of benchmark instances is used to analyze the influence of spatial and temporal instance properties as well as the degree of dynamism to the potential savings that can be achieved by anticipatory waiting and the incorporation of knowledge about future requests. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

32. A genetic fuzzy system for interpretable and parsimonious reinforcement learning policies

Author: Chicano, Francisco, Bishop, Jordan T., Gallagher, Marcus, and Browne, Will N.
Abstract: Reinforcement learning (RL) is experiencing a resurgence in research interest, where Learning Classifier Systems (LCSs) have been applied for many years. However, traditional Michigan approaches tend to evolve large rule bases that are difficult to interpret or scale to domains beyond standard mazes. A Pittsburgh Genetic Fuzzy System (dubbed Fuzzy MoCoCo) is proposed that utilises both multiobjective and cooperative coevolutionary mechanisms to evolve fuzzy rule-based policies for RL environments. Multiobjectivity in the system is concerned with policy performance vs. complexity. The continuous state RL environment Mountain Car is used as a testing bed for the proposed system. Results show the system is able to effectively explore the trade-off between policy performance and complexity, and learn interpretable, high-performing policies that use as few rules as possible.
Published: 2021

33. Multiobjective direct policy search using physically based operating rules in multireservoir systems

Author: Universitat Politècnica de Catalunya. Doctorat en Enginyeria Civil, Universitat Politècnica de Catalunya. CRAHI - Centre de Recerca Aplicada en Hidrometeorologia, Ritter, Josias Manuel Gisbert, Corzo, Gerald, Solomatine, Dimitri P., and Angarita, Héctor
Abstract: This study explores the ways to introduce physical interpretability into the process of optimizing operating rules for multireservoir systems with multiple objectives. Prior studies applied the concept of direct policy search (DPS), in which the release policy is expressed as a set of parameterized functions (e.g., neural networks) that are optimized by simulating the performance of different parameter value combinations over a testing period. The problem with this approach is that the operators generally avoid adopting such artificial black-box functions for the direct real-time control of their systems, preferring simpler tools with a clear connection to the system physics. This study addresses this mismatch by replacing the black-box functions in DPS with physically based parameterized operating rules, for example by directly using target levels in dams as decision variables. This leads to results that are physically interpretable and may be more acceptable to operators. The methodology proposed in this work is applied to a network of five reservoirs and four power plants in the Nechi catchment in Colombia, with four interests involved: average energy generation, firm energy generation, flood hazard, and flow regime alteration. The release policy is expressed depending on only 12 parameters, which significantly reduces the computational complexity compared to existing approaches of multiobjective DPS. The resulting four-dimensional Pareto-approximate set offers a variety of operational strategies from which operators may choose one that corresponds best to their preferences. For demonstration purposes, one particular optimized policy is selected and its parameter values are analyzed to illustrate how the physically based operating rules can be directly interpreted by the operators., Peer Reviewed, Preprint
Published: 2020

34. Adaptive mitigation strategies hedge against extreme climate futures

Author: Giacomo Marangoni, Patrick M. Reed, Jonathan R. Lamontagne, Klaus Keller, and J. Quinn
Subjects: Atmospheric Science, Adaptive strategies, 010504 meteorology & atmospheric sciences, Adaptive mitigation pathways, Climate change, Context (language use), Climate risk management, Integrated assessment modelling, Multi-objective optimization, Direct policy search, 01 natural sciences, 12. Responsible consumption, United Nations Framework Convention on Climate Change, 0502 economics and business, 11. Sustainability, 050207 economics, 0105 earth and related environmental sciences, Sustainable development, Global and Planetary Change, 05 social sciences, 1. No poverty, Environmental economics, 13. Climate action, Business, Futures contract
Abstract: The United Nations Framework Convention on Climate Change agreed to “strengthen the global response to the threat of climate change, in the context of sustainable development and efforts to eradicate poverty” (UNFCCC 2015). Designing a global mitigation strategy to support this goal poses formidable challenges. For one, there are trade-offs between the economic costs and the environmental benefits of averting climate impacts. Furthermore, the coupled human-Earth systems are subject to deep and dynamic uncertainties. Previous economic analyses typically addressed either the former, introducing multiple objectives, or the latter, making mitigation actions responsive to new information. This paper aims at bridging these two separate strands of literature. We demonstrate how information feedback from observed global temperature changes can jointly improve the economic and environmental performance of mitigation strategies. We focus on strategies that maximize discounted expected utility while also minimizing warming above 2 °C, damage costs, and mitigation costs. Expanding on the Dynamic Integrated Climate-Economy (DICE) model and previous multi-objective efforts, we implement closed-loop control strategies, map the emerging trade-offs and quantify the value of the temperature information feedback under both well-characterized and deep climate uncertainties. Adaptive strategies strongly reduce high regrets, guarding against mitigation overspending for less sensitive climate futures, and excessive warming for more sensitive ones.
Published: 2021

35. Reinforcement Learning with Rare Significant Events: Direct Policy Search vs. Gradient Policy Search

Author: Nicolas Fontbonne, Jean-Baptiste André, Paul Ecoffet, Nicolas Bredeche, Sorbonne Université (SU), Institut Jean-Nicod (IJN), Département d'Etudes Cognitives - ENS Paris (DEC), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Collège de France (CdF (institution))-Centre National de la Recherche Scientifique (CNRS)-Département de Philosophie - ENS Paris, and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], PPO, Computer science, gradient policy search, Evolutionary algorithm, [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], 0102 computer and information sciences, 02 engineering and technology, Machine learning, computer.software_genre, 01 natural sciences, Task (project management), continuous state and action spaces, [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], 0202 electrical engineering, electronic engineering, information engineering, Reinforcement learning, evolutionary algorithms, ComputingMilieux_MISCELLANEOUS, CMAES, rare significant events, business.industry, on-policy, direct policy search, 010201 computation theory & mathematics, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, on-line
Abstract: This paper shows that the CMAES direct policy search method fares significantly better than PPO gradient policy search for a reinforcement learning task where significant events are rare.
Published: 2021

36. Policy Search with Rare Significant Events: Choosing the Right Partner to Cooperate with

Author: Paul Ecoffet, Nicolas Fontbonne, Jean-Baptiste André, Nicolas Bredeche, Sorbonne Université (SU), Institut Jean-Nicod (IJN), Département d'Etudes Cognitives - ENS Paris (DEC), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-École des hautes études en sciences sociales (EHESS)-Collège de France (CdF (institution))-Centre National de la Recherche Scientifique (CNRS)-Département de Philosophie - ENS Paris, and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)
Subjects: I.2, FOS: Computer and information sciences, Computer Science - Machine Learning, reinforcement learning, PPO, Computer Science - Artificial Intelligence, gradient policy search, [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Machine Learning (cs.LG), continuous state and action spaces, Reward, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Neural and Evolutionary Computing (cs.NE), evolutionary algorithms, CMAES, rare significant events, Multidisciplinary, I.2.6, Computer Science - Neural and Evolutionary Computing, on-policy, Policy, direct policy search, Artificial Intelligence (cs.AI), cooperation and partner choice, Reinforcement, Psychology, Algorithms, on-line
Abstract: This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms has to play as a reinforcement learning method.
Published: 2021
Full Text: View/download PDF

37. Many-objective reservoir policy identification and refinement to reduce policy inertia and myopia in water management.

Author: Giuliani, M., Herman, J. D., Castelletti, A., and Reed, P.
Subjects: WATER supply management, WATERSHEDS, MUNICIPAL water supply, VISUAL analytics, NUCLEAR power plants, CONOWINGO Dam (Md.)
Abstract: This study contributes a decision analytic framework to overcome policy inertia and myopia in complex river basin management contexts. The framework combines reservoir policy identification, many-objective optimization under uncertainty, and visual analytics to characterize current operations and discover key trade-offs between alternative policies for balancing competing demands and system uncertainties. The approach is demonstrated on the Conowingo Dam, located within the Lower Susquehanna River, USA. The Lower Susquehanna River is an interstate water body that has been subject to intensive water management efforts due to competing demands from urban water supply, atomic power plant cooling, hydropower production, and federally regulated environmental flows. We have identified a baseline operating policy for the Conowingo Dam that closely reproduces the dynamics of current releases and flows for the Lower Susquehanna and thus can be used to represent the preferences structure guiding current operations. Starting from this baseline policy, our proposed decision analytic framework then combines evolutionary many-objective optimization with visual analytics to discover new operating policies that better balance the trade-offs within the Lower Susquehanna. Our results confirm that the baseline operating policy, which only considers deterministic historical inflows, significantly overestimates the system's reliability in meeting the reservoir's competing demands. Our proposed framework removes this bias by successfully identifying alternative reservoir policies that are more robust to hydroclimatic uncertainties while also better addressing the trade-offs across the Conowingo Dam's multisector services. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

38. Optimized look-ahead tree policies: a bridge between look-ahead tree policies and direct policy search.

Author: Jung, Tobias, Wehenkel, Louis, Ernst, Damien, and Maes, Francis
Subjects: *OPTIMAL control theory, *ADAPTIVE control systems, *SEQUENTIAL analysis, *GLOBAL optimization, *PERTURBATION theory
Abstract: Direct policy search (DPS) and look-ahead tree (LT) policies are two popular techniques for solving difficult sequential decision-making problems. They both are simple to implement, widely applicable without making strong assumptions on the structure of the problem, and capable of producing high-performance control policies. However, computationally, both of them are, each in their own way, very expensive. DPS can require huge offline resources (effort required to obtain the policy) to first select an appropriate space of parameterized policies that works well for the targeted problem and then to determine the best values of the parameters via global optimization. LT policies do not require any offline resources; however, they typically require huge online resources (effort required to calculate the best decision at each step) in order to grow trees of sufficient depth. In this paper, we propose optimized LTs (OLTs), a model-based policy learning scheme that lies at the intersection of DPS and LT. In OLT, the control policy is represented indirectly through an algorithm that at each decision step develops, as in LT by using a model of the dynamics, a small LT until a prespecified online budget is exhausted. Unlike LT, the development of the tree is not driven by a generic heuristic; rather, the heuristic is optimized for the target problem and implemented as a parameterized node scoring function learned offline via DPS. We experimentally compare OLT with pure DPS and pure LT variants on optimal control benchmark domains. The results show that the LT-based representation is a versatile way of compactly representing policies in a DPS scheme (which results in OLT being easier to tune and having lower offline complexity than pure DPS) and at the same time DPS helps to significantly reduce the size of the LTs that are required to take high-quality decisions (which results in OLT having lower online complexity than pure LT). Moreover, OLT produces overall better performing policies than pure DPS and pure LT, and also results in policies that are robust with respect to perturbations of the initial conditions. Copyright © 2013 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

39. Performance of Implicit Stochastic Approaches to the Synthesis of Multireservoir Operating Rules

Author: Giorgio Guariso and Matteo Sangiorgio
Subjects: Implicit stochastic optimization, Mathematical optimization, Artificial neural networks, Artificial neural network, Computer science, Geography, Planning and Development, Management, Monitoring, Policy and Law, Water resources, Genetic algorithm, Nile River basin, Direct policy search, Reservoir management, Dependability, Water Science and Technology, Civil and Structural Engineering
Abstract: With increasing pressure on water resources availability and dependability and constraints due to environmental concerns, the traditional approaches for defining reservoir management rules ...
Published: 2020

40. Multiobjective Direct Policy Search Using Physically Based Operating Rules in Multireservoir Systems

Author: Gerald Corzo, Josias Ritter, Hector Angarita, Dimitri Solomatine, Universitat Politècnica de Catalunya. Doctorat en Enginyeria Civil, and Universitat Politècnica de Catalunya. CRAHI - Centre de Recerca Aplicada en Hidrometeorologia
Subjects: Mathematical optimization, Physics - Physics and Society, Computer science, Process (engineering), Geography, Planning and Development, FOS: Physical sciences, Physics and Society (physics.soc-ph), Management, Monitoring, Policy and Law, Parameterization simulation optimization, Physics - Atmospheric and Oceanic Physics, Atmospheric and Oceanic Physics (physics.ao-ph), Rivers--Regulation, Policy myopia, Direct policy search, Cursos d'aigua -- Regulació -- Models matemàtics, Multiobjective reservoir optimization, Multireservoir systems, Enginyeria civil::Enginyeria hidràulica, marítima i sanitària::Embassaments i preses [Àrees temàtiques de la UPC], Water Science and Technology, Civil and Structural Engineering, Interpretability
Abstract: This study explores the ways to introduce physical interpretability into the process of optimizing operating rules for multireservoir systems with multiple objectives. Prior studies applied the concept of direct policy search (DPS), in which the release policy is expressed as a set of parameterized functions (e.g., neural networks) that are optimized by simulating the performance of different parameter value combinations over a testing period. The problem with this approach is that the operators generally avoid adopting such artificial black-box functions for the direct real-time control of their systems, preferring simpler tools with a clear connection to the system physics. This study addresses this mismatch by replacing the black-box functions in DPS with physically based parameterized operating rules, for example by directly using target levels in dams as decision variables. This leads to results that are physically interpretable and may be more acceptable to operators. The methodology proposed in this work is applied to a network of five reservoirs and four power plants in the Nechi catchment in Colombia, with four interests involved: average energy generation, firm energy generation, flood hazard, and flow regime alteration. The release policy is expressed depending on only 12 parameters, which significantly reduces the computational complexity compared to existing approaches of multiobjective DPS. The resulting four-dimensional Pareto-approximate set offers a variety of operational strategies from which operators may choose one that corresponds best to their preferences. For demonstration purposes, one particular optimized policy is selected and its parameter values are analyzed to illustrate how the physically based operating rules can be directly interpreted by the operators.
Published: 2020

41. Uncertainty-Driven Policies for Resource Allocation in Epidemics Response

Author: den Brok, Emma
Abstract: Humanitarians and global health actors come to the aid of many people every year, with the aim of preventing disease, increasing wellbeing, and providing (medical) aid to those suffering from disease. One of the contexts in which they operate is that of an epidemic. An epidemic is dynamic by nature and provides a complex and evolving environment in which medical aid needs to be provided. A key aspect in a response to an epidemic is logistics – specifically the allocation of resources such as personnel and medical supplies. These resources are often limited, calling for a targeted and strategic response. There is a variety of studies tackling the problem of resource allocation in the context of an epidemic, which include sequential decisions as the epidemic evolves, as well as the choice between several locations to which resources can be sent. However, these studies often assume decision-makers have complete information on the situation at hand and can make “perfect” choices. In reality, due to the large number of actors involved in a response, poor (telecommunication) infrastructure, and the fact that an epidemic is a moving target due to its dynamic nature, decision-makers often have to deal with incomplete and uncertain information on the number of patients and the way the epidemic is evolving. [Engineering and Policy Analysis]
Published: 2019

42. Cross-Entropy Optimization of Control Policies With Adaptive Basis Functions.

Author: Busoniu, Lucian, Ernst, Damien, De Schutter, Bart, and Babuska, Robert
Subjects: *CROSS-entropy method, *MATHEMATICAL optimization, *APPROXIMATION theory, *MONTE Carlo method, *MARKOV processes, *DECISION making, *RADIAL basis functions, *SIMULATION methods & models, *COMPUTATIONAL complexity
Abstract: This paper introduces an algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes. The algorithm looks for the best closed-loop policy that can be represented using a given number of basis functions (BFs), where a discrete action is assigned to each BF. The type of the BFs and their number are specified in advance and determine the complexity of the representation. Considerable flexibility is achieved by optimizing the locations and shapes of the BFs, together with the action assignments. The optimization is carried out with the cross-entropy method and evaluates the policies by their empirical return from a representative set of initial states. The return for each representative state is estimated using Monte Carlo simulations. The resulting algorithm for cross-entropy policy search with adaptive BFs is extensively evaluated in problems with two to six state variables, for which it reliably obtains good policies with only a small number of BFs. In these experiments, cross-entropy policy search requires vastly fewer BFs than value-function techniques with equidistant BFs, and outperforms policy search with a competing optimization algorithm called DIRECT. [ABSTRACT FROM AUTHOR]
Published: 2011
Full Text: View/download PDF

43. Neuroevolution strategies for episodic reinforcement learning

Author: Heidrich-Meisner, Verena and Igel, Christian
Subjects: *REINFORCEMENT learning, *EVOLUTIONARY computation, *ALGORITHMS, *MATRICES (Mathematics), *ARTIFICIAL neural networks, *ANALYSIS of covariance, *MARKOV processes
Abstract: Because of their convincing performance, there is a growing interest in using evolutionary algorithms for reinforcement learning. We propose learning of neural network policies by the covariance matrix adaptation evolution strategy (CMA-ES), a randomized variable-metric search algorithm for continuous optimization. We argue that this approach, which we refer to as CMA Neuroevolution Strategy (CMA-NeuroES), is ideally suited for reinforcement learning, in particular because it is based on ranking policies (and therefore robust against noise), efficiently detects correlations between parameters, and infers a search direction from scalar reinforcement signals. We evaluate the CMA-NeuroES on five different (Markovian and non-Markovian) variants of the common pole balancing problem. The results are compared to those described in a recent study covering several RL algorithms, and the CMA-NeuroES shows the overall best performance. [Copyright © Elsevier]
Published: 2009
Full Text: View/download PDF

44. A diagnostic assessment of evolutionary algorithms for multi-objective surface water reservoir control

Author: Jonathan D. Herman, Jazmin Zatarain Salazar, Matteo Giuliani, Andrea Castelletti, and Patrick M. Reed
Subjects: Engineering, Mathematical optimization, Multi-objective evolutionary algorithm, Meteorology & atmospheric sciences, Management science, Suite, Reliability (computer networking), Environmental biotechnology, Evolutionary algorithm, Pareto principle, Parameterized complexity, Engineering and technology, Benchmarking, Multi-purpose reservoir control, Natural sciences, Direct policy search, Environmental engineering, Benchmark (computing), Key (cryptography), Earth and related environmental sciences, Water Science and Technology
Abstract: Globally, the pressures of expanding populations, climate change, and increased energy demands are motivating significant investments in re-operationalizing existing reservoirs or designing operating policies for new ones. These challenges require an understanding of the tradeoffs that emerge across the complex suite of multi-sector demands in river basin systems. This study benchmarks our current capabilities to use Evolutionary Multi-Objective Direct Policy Search (EMODPS), a decision analytic framework in which reservoirs’ candidate operating policies are represented using parameterized global approximators (e.g., radial basis functions) then those parameterized functions are optimized using multi-objective evolutionary algorithms to discover the Pareto approximate operating policies. We contribute a comprehensive diagnostic assessment of modern MOEAs’ abilities to support EMODPS using the Conowingo reservoir in the Lower Susquehanna River Basin, Pennsylvania, USA. Our diagnostic results highlight that EMODPS can be very challenging for some modern MOEAs and that epsilon dominance, time-continuation, and auto-adaptive search are helpful for attaining high levels of performance. The ϵ-MOEA, the auto-adaptive Borg MOEA, and ϵ-NSGAII all yielded superior results for the six-objective Lower Susquehanna benchmarking test case. The top algorithms show low sensitivity to different MOEA parameterization choices and high algorithmic reliability in attaining consistent results for different random MOEA trials. Overall, EMODPS poses a promising method for discovering key reservoir management tradeoffs; however algorithmic choice remains a key concern for problems of increasing complexity.
Published: 2016

45. Can modern multi-objective evolutionary algorithms discover high-dimensional financial risk portfolio tradeoffs for snow-dominated water-energy systems?

Author: Gupta, Rohini S., Hamilton, Andrew L., Reed, Patrick M., and Characklis, Gregory W.
Subjects: *EVOLUTIONARY algorithms, *FINANCIAL risk, *FINANCIAL risk management, *ADAPTIVE natural resource management, *GENETIC algorithms
Abstract: • Benchmarking multi-objective financial risk portfolios for snow-driven hydropower. • Self-adaptive search can more effectively capture complex financial risk tradeoffs. • Decomposition and reference point algorithms deteriorate and misrepresent tradeoffs. Hydropower generation in the Hetch Hetchy Power System is strongly tied to snowmelt dynamics in the central Sierra Nevada and consequently is particularly financially vulnerable to changes in snowpack availability and timing. This study explores the Hetch Hetchy Power System as a representative example from the broader class of financial risk management problems that hold promise in helping utilities such as SFPUC to understand the tradeoffs across portfolios of risk mitigation instruments given uncertainties in snowmelt dynamics. An evolutionary multi-objective direct policy search (EMODPS) framework is implemented to identify time adaptive stochastic rules that map utility state information and exogenous inputs to optimal annual financial decisions. The resulting financial risk mitigation portfolio planning problem is mathematically difficult due to its high dimensionality and mixture of nonlinear, nonconvex, and discrete objectives. These features add to the difficulty of the problem by yielding a Pareto front of solutions that has a highly disjoint and complex geometry. In this study, we contribute a diagnostic assessment of state-of-the-art multi-objective evolutionary algorithms' (MOEAs') abilities to support a DPS framework for managing financial risk. We perform comprehensive diagnostics on five algorithms: the Borg multi-objective evolutionary algorithm, Non-dominated Sorting Genetic Algorithm II (NSGA-II), Non-dominated Sorting Genetic Algorithm III (NSGA-III), Reference Vector Guided Evolutionary Algorithm (RVEA), and the Multi-objective Evolutionary Algorithm Based on Decomposition (MOEA/D). The MOEAs are evaluated to characterize their controllability (ease-of-use), reliability (probability of success), efficiency (minimizing model evaluations), and effectiveness (high quality tradeoff representations). Our results show that newer decomposition, reference point, and reference vector algorithms are highly sensitive to their parameterizations (difficult to use), suffer from search deterioration (losing solutions), and have a strong likelihood of misrepresenting key tradeoffs. The results emphasize the importance of using MOEAs with archiving and adaptive search capabilities in order to solve complex financial risk portfolio problems in snow-dependent water-energy systems. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

46. Balancing exploration, uncertainty and computational demands in many objective reservoir optimization

Author: Matteo Giuliani, Jazmin Zatarain Salazar, J. Quinn, Patrick M. Reed, and Andrea Castelletti
Subjects: Flexibility (engineering), Engineering, Direct policy search, Multi-objective evolutionary optimization, Multi-purpose reservoir control, Parallel strategies, Uncertainty, Water Science and Technology, Management science, Environmental biotechnology, Evolutionary algorithm, Fidelity, Context (language use), Engineering and technology, Environmental engineering, Risk analysis (engineering), Key (cryptography), Production (economics), Quality (business), Hydropower
Abstract: Reservoir operations are central to our ability to manage river basin systems serving conflicting multi-sectoral demands under increasingly uncertain futures. These challenges motivate the need for new solution strategies capable of effectively and efficiently discovering the multi-sectoral tradeoffs that are inherent to alternative reservoir operation policies. Evolutionary many-objective direct policy search (EMODPS) is gaining importance in this context due to its capability of addressing multiple objectives and its flexibility in incorporating multiple sources of uncertainties. This simulation-optimization framework has high potential for addressing the complexities of water resources management, and it can benefit from current advances in parallel computing and meta-heuristics. This study contributes a diagnostic assessment of state-of-the-art parallel strategies for the auto-adaptive Borg Multi Objective Evolutionary Algorithm (MOEA) to support EMODPS. Our analysis focuses on the Lower Susquehanna River Basin (LSRB) system where multiple sectoral demands from hydropower production, urban water supply, recreation and environmental flows need to be balanced. Using EMODPS with different parallel configurations of the Borg MOEA, we optimize operating policies over different size ensembles of synthetic streamflows and evaporation rates. As we increase the ensemble size, we increase the statistical fidelity of our objective function evaluations at the cost of higher computational demands. This study demonstrates how to overcome the mathematical and computational barriers associated with capturing uncertainties in stochastic multiobjective reservoir control optimization, where parallel algorithmic search serves to reduce the wall-clock time in discovering high quality representations of key operational tradeoffs. Our results show that emerging self-adaptive parallelization schemes exploiting cooperative search populations are crucial. Such strategies provide a promising new set of tools for effectively balancing exploration, uncertainty, and computational demands when using EMODPS.
Published: 2017

47. Can Modern Multi-Objective Evolutionary Algorithms Discover High-Dimensional Financial Risk Portfolio Tradeoffs for Snow-Dominated Water-Energy Systems?

Author: Gupta, Rohini
Subjects: Algorithm Benchmarking, Direct Policy Search, Evolutionary Algorithms, Financial Risk Management, Hydropower, Many-Objective Optimization
Abstract: Hydropower-reliant power utilities are becoming increasingly vulnerable to hydrologic variability in states such as California, that have suffered from extensive droughts and reduced winter snowfall. One such utility is the San Francisco Public Utilities Commission (SFPUC), which operates the Hetch Hetchy Power System. SFPUC is strongly reliant on snowmelt from the Sierra Nevada to provide hydropower to San Francisco. Therefore, it is particularly financially vulnerable to changes in snowpack availability and timing, which translates to variability in yearly revenue. Evolutionary multi-objective direct policy search (EMODPS) can be used to identify time adaptive stochastic rules that inform optimal financial decisions based on state and exogenous information. However, the resulting financial risk mitigation portfolio planning problem is difficult to optimize due to its high dimensionality and mixture of nonlinear, nonconvex, and discrete objectives. We contribute a diagnostic assessment of state-of-the-art MOEAs’ abilities to support an EMODPS framework for managing financial risk. We perform comprehensive diagnostics on five algorithms: the Borg multi-objective evolutionary algorithm, Non-dominated Sorting Genetic Algorithm II (NSGA-II), Non-dominated Sorting Genetic Algorithm III (NSGA-III), Reference Vector Guided Evolutionary Algorithm (RVEA), and the Multiobjective Evolutionary Algorithm Based on Decomposition (MOEA/D). MOEA performance is evaluated by analyzing controllability, reliability, efficiency, and effectiveness. The results emphasize the importance of using MOEAs with archiving and adaptive search capabilities in order to solve complex financial risk portfolio problems in snow dependent water-energy systems.
Published: 2019

48. Exemplar-Based Policy with Selectable Strategies and its Optimization Using GA

Subjects: exemplar, direct policy search, genetic algorithm, case based reasoning, Markov decision process
Abstract: As an approach for dynamic control problems and decision making problems, usually formulated as Markov Decision Processes (MDPs), we focus on direct policy search (DPS), where a policy is represented by a model with parameters, and the parameters are optimized so as to maximize the evaluation function by applying the parameterized policy to the problem. In this paper, a novel framework for DPS, an exemplar-based policy optimization using genetic algorithm (EBP-GA), is presented and analyzed. In this approach, the policy is composed of a set of virtual exemplars and a case-based action selector, and the set of exemplars is selected and evolved by a genetic algorithm. Here, an exemplar is a real or virtual, free-styled and suggestive piece of information such as "take the action A at the state S" or "the state S1 is better to attain than S2". One advantage of EBP-GA is the generalization and localization ability for policy expression, based on case-based reasoning methods. Another advantage is that both the introduction of prior knowledge and the extraction of knowledge after optimization are relatively straightforward. These advantages are confirmed through the proposal of two new policy expressions, experiments on two different problems and their analysis.
Published: 2010

49. Season-Dependent Hedging Policies for Reservoir Operation—A Comparison Study.

Author: Bhatia, Nikhil, Srivastav, Roshan, and Srinivasan, Kasthrirengan
Subjects: RESERVOIRS, WATER supply, WATER shortages, PARAMETERIZATION, WATER management
Abstract: During periods of significant water shortage or when drought is impending, it is customary to implement some kind of water supply reduction measures with a view to preventing the occurrence of severe shortages (vulnerability) in the near future. In the case of operation of a water supply reservoir, this reduction of water supply is effected by hedging schemes or hedging policies. This research work aims to compare the popular hedging policies: (i) linear two-point hedging; (ii) modified two-point hedging; and, (iii) discrete hedging based on time-varying and constant hedging parameters. A parameterization-simulation-optimization (PSO) framework is employed for the selection of the parameters of the compromising hedging policies. The multi-objective evolutionary search-based technique (Non-dominated Sorting based Genetic Algorithm-II) was used to identify the Pareto-optimal front of hedging policies that seek to obtain the trade-off between shortage ratio and vulnerability. The case example used for illustration is the Hemavathy reservoir in Karnataka, India. It is observed that the Pareto-optimal front that was obtained from time-varying hedging policies shows significant improvement in reservoir performance when compared to constant hedging policies. The variation in the monthly parameters of the time-variant hedging policies shows a strong correlation with monthly inflows and available water. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

50. DIAGNOSTIC ASSESSMENT AND ADVANCEMENT OF MULTI-OBJECTIVE RESERVOIR CONTROL UNDER UNCERTAINTY

Author: Zatarain Salazar, Jazmin
Subjects: Water resources management, Hydrologic sciences, direct policy search, multiobjective evolutionary algorithms, multi-purpose reservoir control, river basin management, Susquehanna River Basin, water resources, Systems science
Abstract: This dissertation contributes to the assessment of new scientific developments for multi-objective decision support to improve multi-purpose river basin management. The main insights of this work highlight opportunities to improve modeling of complex multi-purpose water reservoir systems and opportunities to flexibly incorporate emerging demands and hydro-climatic uncertainty. Additionally, algorithm diagnostics contributed in this work enable the water resources field to better capitalize on the rapid growth in computational power. This opens new opportunities to increase the scope of the problems that can be solved and contribute to the robustness and sustainability of water systems management worldwide. This dissertation focuses on a multi-purpose reservoir system that captures the contextual and mathematical difficulties confronted in a broad range of global multi-purpose systems challenged by multiple competing demands and uncertainty. The first study demonstrates that advances in state-of-the-art multiobjective evolutionary optimization make it possible to reliably and effectively find control policies that balance conflicting tradeoffs for multi-purpose reservoir control. Multiobjective evolutionary optimization techniques coupled with direct policy search can reliably and flexibly find suitable control policies that adapt to multi-sectorial water needs and to hydro-climatic uncertainty. The second study demonstrates the benefits of cooperative parallel MOEA architectures to reliably and effectively find many objective control policies when the system is subject to uncertainty and computational constraints. The more advanced cooperative, co-evolutionary parallel search expands the scope of problem difficulty that can be reliably addressed while facilitating the discovery of high quality approximations for optimal river basin tradeoffs. The insights from this chapter should enable water resources analysts to devote computational efforts towards representing reservoir systems more accurately by capturing uncertainty and multiple demands when properly using parallel coordinated search. The third study extended multi-purpose reservoir control to better capture flood protection. A risk-averse formulation contributed to the discovery of control policies that improve operations during hydrologic extremes. Overall this dissertation has carefully evaluated and advanced the Evolutionary Multiobjective Direct Policy Search (EMODPS) framework to support multi-objective and robust management of conflicting demands in complex reservoir systems.
Published: 2018
