Author: "Van Hasselt A" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Van Hasselt A"' showing total 6,132 results

1. Normalization and effective learning rates in reinforcement learning

Author: Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, Martens, James, van Hasselt, Hado, Pascanu, Razvan, and Dabney, Will
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting overestimation bias. However, normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate. This becomes problematic in continual learning settings, where the resulting effective learning rate schedule may decay to near zero too quickly relative to the timescale of the learning problem. We propose to make the learning rate schedule explicit with a simple re-parameterization which we call Normalize-and-Project (NaP), which couples the insertion of normalization layers with weight projection, ensuring that the effective learning rate remains constant throughout training. This technique reveals itself as a powerful analytical tool to better understand learning rate schedules in deep reinforcement learning, and as a means of improving robustness to nonstationarity in synthetic plasticity loss benchmarks along with both the single-task and sequential variants of the Arcade Learning Environment. We also show that our approach can be easily applied to popular architectures such as ResNets and transformers while recovering and in some cases even slightly improving the performance of the base model in common stationary benchmarks.
Published: 2024

2. Generation of realistic virtual adult populations using a model-based copula approach

Author: Guo, Yuchen, Guo, Tingjie, Knibbe, Catherijne A. J., Zwep, Laura B., and van Hasselt, J. G. Coen
Published: 2024
Full Text: View/download PDF

3. Changes in Plasma Clearance of CYP450 Probe Drugs May Not be Specific for Altered In Vivo Enzyme Activity Under (Patho)Physiological Conditions: How to Interpret Findings of Probe Cocktail Studies

Author: de Jong, Laura M., van de Kreeke, Marinda, Ahmadi, Mariam, Swen, Jesse J., Knibbe, Catherijne A. J., van Hasselt, J. G. Coen, Manson, Martijn L., and Krekels, Elke H. J.
Published: 2024
Full Text: View/download PDF

4. Impact of Continuous Infusion Meropenem PK/PD Target Attainment on C-Reactive Protein Dynamics in Critically Ill Patients With Documented Gram-Negative Hospital-Acquired or Ventilator-Associated Pneumonia

Author: Troisi, Carla, Cojutti, Pier Giorgio, Rinaldi, Matteo, Tonetti, Tommaso, Siniscalchi, Antonio, van Hasselt, Coen, Viale, Pierluigi, and Pea, Federico
Published: 2024
Full Text: View/download PDF

5. Disentangling the Causes of Plasticity Loss in Neural Networks

Author: Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, van Hasselt, Hado, Pascanu, Razvan, Martens, James, and Dabney, Will
Subjects: Computer Science - Machine Learning
Abstract: Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution. In settings where this assumption is violated, e.g. deep reinforcement learning, learning algorithms become unstable and brittle with respect to hyperparameters and even random seeds. One factor driving this instability is the loss of plasticity, meaning that updating the network's predictions in response to new information becomes more difficult as training progresses. While many recent works provide analyses and partial solutions to this phenomenon, a fundamental question remains unanswered: to what extent do known mechanisms of plasticity loss overlap, and how can mitigation strategies be combined to best maintain the trainability of a network? This paper addresses these questions, showing that loss of plasticity can be decomposed into multiple independent mechanisms and that, while intervening on any single mechanism is insufficient to avoid the loss of plasticity in all cases, intervening on multiple mechanisms in conjunction results in highly robust learning algorithms. We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks, and further demonstrate its effectiveness on naturally arising nonstationarities, including reinforcement learning in the Arcade Learning Environment.
Published: 2024

6. Diagnostic accuracy of an artificial intelligence algorithm versus radiologists for fracture detection on cervical spine CT

Author: van den Wittenboer, Gaby J., van der Kolk, Brigitta Y. M., Nijholt, Ingrid M., Langius-Wiffen, Eline, van Dijk, Rogier A., van Hasselt, Boudewijn A. A. M., Podlogar, Martin, van den Brink, Wimar A., Bouma, Gert Joan, Schep, Niels W. L., Maas, Mario, and Boomsma, Martijn F.
Published: 2024
Full Text: View/download PDF

7. A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Author: Pignatelli, Eduardo, Ferret, Johan, Geist, Matthieu, Mesnard, Thomas, van Hasselt, Hado, Pietquin, Olivier, and Toni, Laura
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The Credit Assignment Problem (CAP) refers to the longstanding challenge of Reinforcement Learning (RL) agents to associate actions with their long-term consequences. Solving the CAP is a crucial step towards the successful deployment of RL in the real world since most decision problems provide feedback that is noisy, delayed, and with little or no information about the causes. These conditions make it hard to distinguish serendipitous outcomes from those caused by informed decision-making. However, the mathematical nature of credit and the CAP remains poorly understood and defined. In this survey, we review the state of the art of Temporal Credit Assignment (CA) in deep RL. We propose a unifying formalism for credit that enables equitable comparisons of state-of-the-art algorithms and improves our understanding of the trade-offs between the various methods. We cast the CAP as the problem of learning the influence of an action over an outcome from a finite amount of experience. We discuss the challenges posed by delayed effects, transpositions, and a lack of action influence, and analyse how existing methods aim to address them. Finally, we survey the protocols to evaluate a credit assignment method and suggest ways to diagnose the sources of struggle for different methods. Overall, this survey provides an overview of the field for new-entry practitioners and researchers, it offers a coherent perspective for scholars looking to expedite the starting stages of a new study on the CAP, and it suggests potential directions for future research. Comment: 56 pages, 2 figures, 4 tables
Published: 2023

8. QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

Author: van den Maagdenberg, Helle W., Šícho, Martin, Araripe, David Alencar, Luukkonen, Sohvi, Schoenmaker, Linde, Jespers, Michiel, Béquignon, Olivier J. M., González, Marina Gorostiola, van den Broek, Remco L., Bernatavicius, Andrius, van Hasselt, J. G. Coen, van der Graaf, Piet. H., and van Westen, Gerard J. P.
Published: 2024
Full Text: View/download PDF

9. QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

Author: Helle W. van den Maagdenberg, Martin Šícho, David Alencar Araripe, Sohvi Luukkonen, Linde Schoenmaker, Michiel Jespers, Olivier J. M. Béquignon, Marina Gorostiola González, Remco L. van den Broek, Andrius Bernatavicius, J. G. Coen van Hasselt, Piet. H. van der Graaf, and Gerard J. P. van Westen
Subjects: QSPR modelling, QSAR modelling, Proteochemometrics, Cheminformatics, Machine learning, Software, Information technology, T58.5-58.64, Chemistry, QD1-999
Abstract: Building reliable and robust quantitative structure–property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously growing and evaluating different algorithms and methodologies can be arduous. Finally, the last hurdle that researchers face is to ensure the reproducibility of their models and facilitate their transferability into practice. In this work, we introduce QSPRpred, a toolkit for analysis of bioactivity data sets and QSPR modelling, which attempts to address the aforementioned challenges. QSPRpred’s modular Python API enables users to intuitively describe different parts of a modelling workflow using a plethora of pre-implemented components, but also integrates customized implementations in a “plug-and-play” manner. QSPRpred data sets and models are directly serializable, which means they can be readily reproduced and put into operation after training as the models are saved with all required data pre-processing steps to make predictions on new compounds directly from SMILES strings. The general-purpose character of QSPRpred is also demonstrated by inclusion of support for multi-task and proteochemometric modelling. The package is extensively documented and comes with a large collection of tutorials to help new users. In this paper, we describe all of QSPRpred’s functionalities and also conduct a small benchmarking case study to illustrate how different components can be leveraged to compare a diverse set of models. QSPRpred is fully open-source and available at https://github.com/CDDLeiden/QSPRpred . Scientific Contribution: QSPRpred aims to provide a complex, but comprehensive Python API to conduct all tasks encountered in QSPR modelling from data preparation and analysis to model creation and model deployment. In contrast to similar packages, QSPRpred offers a wider and more exhaustive range of capabilities and integrations with many popular packages that also go beyond QSPR modelling. A significant contribution of QSPRpred is also in its automated and highly standardized serialization scheme, which significantly improves reproducibility and transferability of models.
Published: 2024
Full Text: View/download PDF

10. A Definition of Continual Reinforcement Learning

Author: Abel, David, Barreto, André, Van Roy, Benjamin, Precup, Doina, van Hasselt, Hado, and Singh, Satinder
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: In a standard view of the reinforcement learning problem, an agent's goal is to efficiently identify a policy that maximizes long-term reward. However, this perspective is based on a restricted view of learning as finding a solution, rather than treating learning as endless adaptation. In contrast, continual reinforcement learning refers to the setting in which the best agents never stop learning. Despite the importance of continual reinforcement learning, the community lacks a simple definition of the problem that highlights its commitments and makes its primary concepts precise and clear. To this end, this paper is dedicated to carefully defining the continual reinforcement learning problem. We formalize the notion of agents that "never stop learning" through a new mathematical language for analyzing and cataloging agents. Using this new language, we define a continual learning agent as one that can be understood as carrying out an implicit search process indefinitely, and continual reinforcement learning as the setting in which the best agents are all continual learning agents. We provide two motivating examples, illustrating that traditional views of multi-task reinforcement learning and continual supervised learning are special cases of our definition. Collectively, these definitions and perspectives formalize many intuitive concepts at the heart of learning, and open new research pathways surrounding continual learning agents. Comment: NeurIPS 2023
Published: 2023

11. On the Convergence of Bounded Agents

Author: Abel, David, Barreto, André, van Hasselt, Hado, Van Roy, Benjamin, Precup, Doina, and Singh, Satinder
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: When has an agent converged? Standard models of the reinforcement learning problem give rise to a straightforward definition of convergence: An agent converges when its behavior or performance in each environment state stops changing. However, as we shift the focus of our learning problem from the environment's state to the agent's state, the concept of an agent's convergence becomes significantly less clear. In this paper, we propose two complementary accounts of agent convergence in a framing of the reinforcement learning problem that centers around bounded agents. The first view says that a bounded agent has converged when the minimal number of states needed to describe the agent's future behavior cannot decrease. The second view says that a bounded agent has converged just when the agent's performance only changes if the agent's internal state changes. We establish basic properties of these two definitions, show that they accommodate typical views of convergence in standard settings, and prove several facts about their nature and relationship. We take these perspectives, definitions, and analysis to bring clarity to a central idea of the field.
Published: 2023

12. Association between Increased Risk of Pneumonia with ICS in COPD: A Continuous Variable Analysis of Patient Factors from the IMPACT Study

Author: Aggarwal, Bhumika, Jones, Paul, Casas, Alejandro, Gomes, Mauro, Juthong, Siwasak, Litewka, Diego, Ong-Dela Cruz, Bernice, Ramirez-Venegas, Alejandra, Sayiner, Abdullah, van Hasselt, James, Compton, Chris, Tombs, Lee, Weng, Stephen, and Levy, Gur
Published: 2024
Full Text: View/download PDF

13. Semi-mechanistic modeling of resistance development to β-lactam and β-lactamase-inhibitor combinations

Author: Tandar, Sebastian T., Aulin, Linda B.S., Leemkuil, Eva M. J., Liakopoulos, Apostolos, and van Hasselt, J. G. Coen
Published: 2024
Full Text: View/download PDF

14. Seasonal variation in sleep time: jackdaws sleep when it is dark, but do they really need it?

Author: van Hasselt, Sjoerd J., Coscia, Massimiliano, Allocca, Giancarlo, Vyssotski, Alexei L., and Meerlo, Peter
Published: 2024
Full Text: View/download PDF

15. Predictions of Bedaquiline Central Nervous System Exposure in Patients with Tuberculosis Meningitis Using Physiologically based Pharmacokinetic Modeling

Author: Mehta, Krina, Balazki, Pavel, van der Graaf, Piet H., Guo, Tingjie, and van Hasselt, J. G. Coen
Published: 2024
Full Text: View/download PDF

16. Brusatol attenuated proliferation and invasion induced by KRAS in differentiated thyroid cancer through inhibiting Nrf2

Author: Gong, Z., Xue, L., Vlantis, A. C., van Hasselt, C. A., Chan, J. Y. K., Fang, J., Wang, R., Yang, Y., Li, D., Zeng, X., Tong, M. C. F., and Chen, G. G.
Published: 2024
Full Text: View/download PDF

17. Exploration via Epistemic Value Estimation

Author: Schmitt, Simon, Shawe-Taylor, John, and van Hasselt, Hado
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: How to efficiently explore in reinforcement learning is an open problem. Many exploration algorithms employ the epistemic uncertainty of their own value predictions -- for instance to compute an exploration bonus or upper confidence bound. Unfortunately the required uncertainty is difficult to estimate in general with function approximation. We propose epistemic value estimation (EVE): a recipe that is compatible with sequential decision making and with neural network function approximators. It equips agents with a tractable posterior over all their parameters from which epistemic value uncertainty can be computed efficiently. We use the recipe to derive an epistemic Q-Learning agent and observe competitive performance on a series of benchmarks. Experiments confirm that the EVE recipe facilitates efficient exploration in hard exploration tasks.
Published: 2023

18. Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

Author: Jiang, Chentian, Ke, Nan Rosemary, and van Hasselt, Hado
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: To generalize across tasks, an agent should acquire knowledge from past tasks that facilitate adaptation and exploration in future tasks. We focus on the problem of in-context adaptation and exploration, where an agent only relies on context, i.e., history of states, actions and/or rewards, rather than gradient-based updates. Posterior sampling (extension of Thompson sampling) is a promising approach, but it requires Bayesian inference and dynamic programming, which often involve unknowns (e.g., a prior) and costly computations. To address these difficulties, we use a transformer to learn an inference process from training tasks and consider a hypothesis space of partial models, represented as small Markov decision processes that are cheap for dynamic programming. In our version of the Symbolic Alchemy benchmark, our method's adaptation speed and exploration-exploitation balance approach those of an exact posterior sampling oracle. We also show that even though partial models exclude relevant information from the environment, they can nevertheless lead to good policies. Comment: In proceedings of the Reincarnating Reinforcement Learning (RRL) Workshop at ICLR 2023 and the Neuro-Symbolic AI for Agent and Multi-Agent Systems (NeSyMAS) Workshop at AAMAS 2023
Published: 2023

19. Optimistic Meta-Gradients

Author: Flennerhag, Sebastian, Zahavy, Tom, O'Donoghue, Brendan, van Hasselt, Hado, György, András, and Singh, Satinder
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: We study the connection between gradient-based meta-learning and convex optimisation. We observe that gradient descent with momentum is a special case of meta-gradients, and building on recent results in optimisation, we prove convergence rates for meta-learning in the single task setting. While a meta-learned update rule can yield faster convergence up to constant factor, it is not sufficient for acceleration. Instead, some form of optimism is required. We show that optimism in meta-learning can be captured through Bootstrapped Meta-Gradients (Flennerhag et al., 2022), providing deeper insight into its underlying mechanics.
Published: 2023

20. Interspecies interactions alter the antibiotic sensitivity of Pseudomonas aeruginosa

Author: C. I. M. Koumans, S. T. Tandar, A. Liakopoulos, and J. G. C. van Hasselt
Subjects: antimicrobial chemotherapy, polymicrobial infections, pharmacology, cystic fibrosis, PK-PD, Microbiology, QR1-502
Abstract: Polymicrobial infections are infections that are caused by multiple pathogens and are common in patients with cystic fibrosis (CF). Although polymicrobial infections are associated with poor treatment responses in CF, the effects of the ecological interactions between co-infecting pathogens on antibiotic sensitivity and treatment outcome are poorly characterized. To this end, we systematically quantified the impact of these effects on the antibiotic sensitivity of Pseudomonas aeruginosa for nine antibiotics in medium conditioned by 13 secondary cystic fibrosis-associated bacterial and fungal pathogens through time-kill assays. We fitted pharmacodynamic models to these kill curves for each antibiotic-species combination and found that interspecies interactions changing the antibiotic sensitivity of P. aeruginosa are abundant. Interactions that lower antibiotic sensitivity are more common than those that increase it, with generally more substantial reductions than increases in sensitivity. For a selection of co-infecting species, we performed pharmacokinetic–pharmacodynamic modeling of P. aeruginosa treatment. We predicted that interspecies interactions can either improve or reduce treatment response to the extent that treatment is rendered ineffective from a previously effective antibiotic dosing schedule and vice versa. In summary, we show that quantifying the ecological interaction effects as pharmacodynamic parameters is necessary to determine the abundance and the extent to which these interactions affect antibiotic sensitivity in polymicrobial infections. Importance: In cystic fibrosis (CF) patients, chronic respiratory tract infections are often polymicrobial, involving multiple pathogens simultaneously. Polymicrobial infections are difficult to treat as they often respond unexpectedly to antibiotic treatment, which might be explained by co-infecting pathogens influencing each other’s antibiotic sensitivity, but it is unknown to what extent such effects occur. To investigate this, we systematically quantified the impact of co-infecting species on antibiotic sensitivity, focusing on P. aeruginosa, a common CF pathogen. We studied, for a large set of co-infecting species and antibiotics, whether changes in antibiotic response occur. Based on these experiments, we used mathematical modeling to simulate P. aeruginosa’s response to colistin and tobramycin treatment in the presence of multiple pathogens. This study offers comprehensive data on altered antibiotic sensitivity of P. aeruginosa in polymicrobial infections, serves as a foundation for optimizing treatment of such infections, and consolidates the importance of considering co-infecting pathogens.
Published: 2024
Full Text: View/download PDF

21. Emergency Communication Operators: Findings from the National Wellness Survey for Public Safety Personnel

Author: Blalock, Jessica R., Black, Ryan A., Bourke, Michael L., and Van Hasselt, Vincent B.
Published: 2024
Full Text: View/download PDF

22. Human-level Atari 200x faster

Author: Kapturowski, Steven, Campos, Víctor, Jiang, Ray, Rakićević, Nemanja, van Hasselt, Hado, Blundell, Charles, and Badia, Adrià Puigdomènech
Subjects: Computer Science - Machine Learning
Abstract: The task of building general agents that perform well over a wide range of tasks has been an important goal in reinforcement learning since its inception. The problem has been the subject of a large body of work, with performance frequently measured by observing scores over the wide range of environments contained in the Atari 57 benchmark. Agent57 was the first agent to surpass the human benchmark on all 57 games, but this came at the cost of poor data-efficiency, requiring nearly 80 billion frames of experience to achieve. Taking Agent57 as a starting point, we employ a diverse set of strategies to achieve a 200-fold reduction of experience needed to outperform the human baseline. We investigate a range of instabilities and bottlenecks we encountered while reducing the data regime, and propose effective solutions to build a more robust and efficient agent. We also demonstrate competitive performance with high-performing methods such as Muesli and MuZero. The four key components to our approach are (1) an approximate trust region method which enables stable bootstrapping from the online network, (2) a normalisation scheme for the loss and priorities which improves robustness when learning a set of value functions with a wide range of scales, (3) an improved architecture employing techniques from NFNets in order to leverage deeper networks without the need for normalization layers, and (4) a policy distillation method which serves to smooth out the instantaneous greedy policy over time.
Published: 2022

23. Longitudinal metabolite profiling of Streptococcus pneumoniae-associated community-acquired pneumonia

Author: den Hartog, Ilona, Zwep, Laura B., Meulman, Jacqueline J., Hankemeier, Thomas, van de Garde, Ewoudt M. W., and van Hasselt, J. G. Coen
Published: 2024
Full Text: View/download PDF

24. Interplay of virulence factors shapes ecology and treatment outcomes in polymicrobial infections

Author: Herzberg, C., van Meegen, E.N., and van Hasselt, J.G.C.
Published: 2024
Full Text: View/download PDF

25. Association between Increased Risk of Pneumonia with ICS in COPD: A Continuous Variable Analysis of Patient Factors from the IMPACT Study

Author: Bhumika Aggarwal, Paul Jones, Alejandro Casas, Mauro Gomes, Siwasak Juthong, Diego Litewka, Bernice Ong-Dela Cruz, Alejandra Ramirez-Venegas, Abdullah Sayiner, James van Hasselt, Chris Compton, Lee Tombs, Stephen Weng, and Gur Levy
Subjects: IMPACT, Post hoc analysis, Pneumonia risk, COPD, ICS, Diseases of the respiratory system, RC705-779
Abstract: Introduction: Despite the proven benefits of inhaled corticosteroid (ICS)-containing triple therapy for chronic obstructive pulmonary disease (COPD), clinicians limit patient exposure to ICS due to the risk of pneumonia. However, there are multiple factors associated with the risk of pneumonia in patients with COPD. This post hoc analysis of IMPACT trial data aims to set the risks associated with ICS into a context of specific patient-related factors that contribute to the risk of pneumonia. Methods: The 52-week, double-blind IMPACT trial randomized patients with symptomatic COPD and ≥1 exacerbation in the prior year 2:2:1 to once-daily fluticasone furoate (FF)/umeclidinium (UMEC)/vilanterol (VI), FF/VI or UMEC/VI. Annual rate of on-treatment pneumonias in the intent-to-treat population associated with age, body mass index (BMI), percent predicted forced expiratory volume in 1 s (FEV1) and blood eosinophil count (BEC) was evaluated. Results: This analysis revealed that the annual rate of pneumonia showed the lowest risk at the age of 50 years. The 95% confidence intervals (CI) between ICS-containing and non-ICS containing treatments diverged in ages > 63 years, suggesting a significantly increased ICS-related risk in older patients. In contrast, the annual rate of pneumonia rose in both groups below BMI of 22.5 kg/m², but above that, there was no relationship to pneumonia rate and no differential effect between the two groups. The relationship between BEC and pneumonia was flat up to > 300/µL cells with ICS-containing treatment and then rose. In contrast, the rate of pneumonia with non-ICS containing treatment appeared to increase at a lower level of BEC (~ 200/µL). Conclusions: There was little evidence of a differential effect of older age, lower BMI, lower FEV1 and BEC on the pneumonia rate between ICS-containing and non-ICS containing treatments. This analysis points to the need for a balanced approach to risk versus benefit in the use of ICS-containing treatments in COPD. Clinical trial registration: IMPACT ClinicalTrials.gov number, NCT02164513.
Published: 2024
Full Text: View/download PDF

26. Selective Credit Assignment

Author: Chelu, Veronica, Borsa, Diana, Precup, Doina, and van Hasselt, Hado
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Efficient credit assignment is essential for reinforcement learning algorithms in both prediction and control settings. We describe a unified view on temporal-difference algorithms for selective credit assignment. These selective algorithms apply weightings to quantify the contribution of learning updates. We present insights into applying weightings to value-based learning and planning algorithms, and describe their role in mediating the backward credit distribution in prediction and control. Within this space, we identify some existing online learning algorithms that can assign credit selectively as special cases, as well as add new algorithms that assign credit backward in time counterfactually, allowing credit to be assigned off-trajectory and off-policy.
Published: 2022

27. Chaining Value Functions for Off-Policy Learning

Author: Schmitt, Simon, Shawe-Taylor, John, and van Hasselt, Hado
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: To accumulate knowledge and improve its policy of behaviour, a reinforcement learning agent can learn 'off-policy' about policies that differ from the policy used to generate its experience. This is important to learn counterfactuals, or because the experience was generated out of its own control. However, off-policy learning is non-trivial, and standard reinforcement-learning algorithms can be unstable and divergent. In this paper we discuss a novel family of off-policy prediction algorithms which are convergent by construction. The idea is to first learn on-policy about the data-generating behaviour, and then bootstrap an off-policy value estimate on this on-policy estimate, thereby constructing a value estimate that is partially off-policy. This process can be repeated to build a chain of value functions, each time bootstrapping a new estimate on the previous estimate in the chain. Each step in the chain is stable and hence the complete algorithm is guaranteed to be stable. Under mild conditions this comes arbitrarily close to the off-policy TD solution when we increase the length of the chain. Hence it can compute the solution even in cases where off-policy TD diverges. We prove that the proposed scheme is convergent and corresponds to an iterative decomposition of the inverse key matrix. Furthermore it can be interpreted as estimating a novel objective -- that we call a 'k-step expedition' -- of following the target policy for finitely many steps before continuing indefinitely with the behaviour policy. Empirically we evaluate the idea on challenging MDPs such as Baird's counter example and observe favourable results.
Published: 2022

28. Unraveling the Effects of Acute Inflammation on Pharmacokinetics: A Model-Based Analysis Focusing on Renal Glomerular Filtration Rate and Cytochrome P450 3A4-Mediated Metabolism

Author: Liu, Feiyan, Aulin, Linda B. S., Manson, Martijn L., Krekels, Elke H. J., and van Hasselt, J. G. Coen
Published: 2023
Full Text: View/download PDF

29. Troubled in school: does maternal involvement matter for adolescents?

Author: Norris, Jonathan and van Hasselt, Martijn
Published: 2023
Full Text: View/download PDF

30. Behavioral Health Training and Peer Support Programs

Author: Pressley, Hannah, Blalock, Jessica R., Van Hasselt, Vincent B., Bourke, Michael L., editor, Van Hasselt, Vincent B., editor, and Buser, Sam J., editor
Published: 2023
Full Text: View/download PDF

31. Emergency Communications Operators

Author: Beamer, Angela T., Thomas, Tara D., White, Sheri L., Van Hasselt, Vincent B., Bourke, Michael L., editor, Van Hasselt, Vincent B., editor, and Buser, Sam J., editor
Published: 2023
Full Text: View/download PDF

32. Suicide and Self-Harm in the Military

Author: Baker, Monty T., Ojeda, Alyssa R., Pressley, Hannah, Blalock, Jessica, Martinez, Riki Ann, Moore, Brian A., and Van Hasselt, Vincent B.
Published: 2023
Full Text: View/download PDF

33. Intimate Partner and Domestic Violence Among Military Populations

Author: Baker, Monty T., Ojeda, Alyssa R., Pressley, Hannah, Blalock, Jessica, Martinez, Riki Ann, Moore, Brian A., and Van Hasselt, Vincent B.
Published: 2023
Full Text: View/download PDF

34. Clinical Implications, Limitations, Future Directions, and Conclusions

Author: Baker, Monty T., Ojeda, Alyssa R., Pressley, Hannah, Blalock, Jessica, Martinez, Riki Ann, Moore, Brian A., and Van Hasselt, Vincent B.
Published: 2023
Full Text: View/download PDF

35. Violent Criminal Behavior in the Military

Author: Baker, Monty T., Ojeda, Alyssa R., Pressley, Hannah, Blalock, Jessica, Martinez, Riki Ann, Moore, Brian A., and Van Hasselt, Vincent B.
Published: 2023
Full Text: View/download PDF

36. Military Sexual Violence: Sexual Assault, Sexual Harassment, and Sexual Hazing

Author: Baker, Monty T., Ojeda, Alyssa R., Pressley, Hannah, Blalock, Jessica, Martinez, Riki Ann, Moore, Brian A., and Van Hasselt, Vincent B.
Published: 2023
Full Text: View/download PDF

37. Self-Consistent Models and Values

Author: Farquhar, Gregory, Baumli, Kate, Marinho, Zita, Filos, Angelos, Hessel, Matteo, van Hasselt, Hado, and Silver, David
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment. In particular, models enable planning, i.e. using more computation to improve value functions or policies, without requiring additional environment interactions. In this work, we investigate a way of augmenting model-based RL, by additionally encouraging a learned model and value function to be jointly self-consistent. Our approach differs from classic planning methods such as Dyna, which only update values to be consistent with the model. We propose multiple self-consistency updates, evaluate these in both tabular and function approximation settings, and find that, with appropriate choices, self-consistency helps both policy evaluation and control. Comment: NeurIPS 2021
Published: 2021

38. Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity

Author: Garnelo, Marta, Czarnecki, Wojciech Marian, Liu, Siqi, Tirumala, Dhruva, Oh, Junhyuk, Gidel, Gauthier, van Hasselt, Hado, and Balduzzi, David
Subjects: Computer Science - Artificial Intelligence
Abstract: Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance. Furthermore, in games with non-transitivities diversity allows a player to cover several winning strategies. However, despite the significance of strategic diversity, training agents that exhibit diverse behaviour remains a challenge. In this paper we study how to construct diverse populations of agents by carefully structuring how individuals within a population interact. Our approach is based on interaction graphs, which control the flow of information between agents during training and can encourage agents to specialise on different strategies, leading to improved overall performance. We provide evidence for the importance of diversity in multi-agent training and analyse the effect of applying different interaction graphs on the training trajectories, diversity and performance of populations in a range of games. This is an extended version of the long abstract published at AAMAS.
Published: 2021

39. Collateral-based selection for endovascular treatment of acute ischaemic stroke in the late window (MR CLEAN-LATE): 2-year follow-up of a phase 3, multicentre, open-label, randomised controlled trial in the Netherlands

Author: van Oostenbrugge, Robert, van Zwam, Wim, Olthuis, Susanne, Pirson, Anne, Hinsenveld, Wouter, Goldhoorn, Robert-Jan, Staals, Julie, Dippel, Diederik, van der Lugt, Aad, van Es, Adriaan, Roozenbeek, Bob, van Doormaal, Pieter-Jan, Roos, Yvo, Majoie, Charles, Coutinho, Jonathan, Emmer, Bart, van der Worp, Bart, Lo, Rob, van Walderveen, Marianne, Wermer, Marieke, van Dijk, Ewoud, Jenniskens, Sjoerd, Boogaarts, Hieronymus, Uyttenboogaart, Maarten, Bokkers, Reinoud, Keizer, Koos, Gons, Rob, Yo, Lonneke, den Hertog, Heleen, van Hasselt, Boudewijn, Schonewille, Wouter, Vos, Jan-Albert, van Tuijl, Julia, Boukrab, Issam, Kortman, Hans, Hofmeijer, Jeannette, Martens, Jasper, van den Wijngaard, Ido, Boiten, Jelis, Lycklama à Nijeholt, Geert, Brouwers, Paul, Sturm, Emiel, Bulut, Tomas, de Laat, Karlijn, van Dijk, Lukas, Remmers, Michel, de Jong, Thijs, Rozeman, Anouk, Elgersma, Otto, Van der Veen, Bas, Sudiono, Davy, Mattle, Heinrich, Fiehler, Jens, van Kuijk, Sander, Nieboer, Daan, Lingsma, Hester, van Nuland, Rick, Roosendaal, Stefan, Krietemeijer, Menno, Postma, Alida, Van den Berg, René, Beenen, Ludo, Hammer, Sebastiaan, Meijer, Anton, van der Hoorn, Anouk, Yoo, Albert, Gerrits, Dick, Jansen, Ben, Truijman, Martine, Manschot, Sanne, Kerkhoff, Henk, Koudstaal, Peter, Chalos, Vicky, Berkhemer, Olvert, Versteeg, Adriaan, Wolff, Lennard, Su, Jiahang, van der Sluijs, Matthijs, van Voorst, Henk, Tolhuisen, Manon, ten Cate, Hugo, de Maat, Moniek, Donse-Donkel, Samantha, van Beusekom, Heleen, Taha, Aladdin, Barakzie, Aarazo, Treurniet, Kilian, van den Berg, Sophie, LeCouffe, Natalie, van de Graaf, Rob, de Ridder, Inger, Pinckaers, Florentina, Ceulemans, Angelique, Knapen, Robrecht, Robbe, Quirien, Sondag, Lotte, Kappelhof, Manon, Reinink, Rik, Silvis, Suzanne, Schreuder, Floris, Uniken Venema, Simone, van Meenen, Laura, Collette, Sabine, van Wijngaarden, Wilma, van der Steen, Wouter, Hoving, Jan, Verheesen, Sabrina, Sterrenberg, Martin, El Ghannouti, Naziha, Sprengers, Rita, 
van Ahee, Ayla, Zweedijk, Berber, Pellikaan, Wilma, Schonewille, Irati, Blauwendraat, Kitty, Drabbe, Yvonne, Kleine-Kathöfer, Anke, de Meris, Joke, Sandiman, Michelle, Dofferhoff-Vermeulen, Tamara, Simons, Michelle, Bongenaar, Hester, Smallegange, Maylee, van Loon, Anja, Kraus, Karin, Bos-Verheij, Erna, Santegoets, Ester, Kooij, Suze, Slotboom, Annemarie, Ponjee, Eva, Eilander, Rieke, Droste, Hanneke, van Veen, Esther, Visser, Rosalie, Lodico, Jasmijn, de Jong, Marieke, van der Minne, Friedus, Cleophas, Eefje, Muskens, Ernst, Nijst, Amy, Heiligers, Leontien, Martens, Yvonne, Slotboom, Miranda, Hintzen, Rogier, Jacobs, Bart, Huijberts, Ilse, Pinckaers, Florentina M E, Olthuis, Susanne G H, van Kuijk, Sander M J, Postma, Alida A, Boogaarts, Hieronymus D, Roos, Yvo B W E M, Majoie, Charles B L M, Dippel, Diederik W J, van Zwam, Wim H, and van Oostenbrugge, Robert J
Published: 2024
Full Text: View/download PDF

40. Construction of a novel six-gene signature to predict tumour response to induction chemotherapy and overall survival in locoregionally advanced laryngeal and hypopharyngeal carcinoma

Author: Chen Tan, Lingwa Wang, Yifan Yang, Shizhi He, George G. Chen, Jason YK. Chan, Michael CF. Tong, C.A. van Hasselt, Wenbin Xu, Ling Feng, Ru Wang, and Jugao Fang
Subjects: Medicine (General), R5-920, Genetics, QH426-470
Published: 2024
Full Text: View/download PDF

41. Introducing Symmetries to Black Box Meta Reinforcement Learning

Author: Kirsch, Louis, Flennerhag, Sebastian, van Hasselt, Hado, Friesen, Abram, Oh, Junhyuk, and Chen, Yutian
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing, Statistics - Machine Learning
Abstract: Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action & observation spaces, tasks, and environments. Comment: AAAI 2022
Published: 2021

42. Bootstrapped Meta-Learning

Author: Flennerhag, Sebastian, Schroecker, Yannick, Zahavy, Tom, van Hasselt, Hado, Silver, David, and Singh, Satinder
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimisation problem. We propose an algorithm that tackles this problem by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance to that target under a chosen (pseudo-)metric. Focusing on meta-learning with gradients, we establish conditions that guarantee performance improvements and show that the metric can control meta-optimisation. Meanwhile, the bootstrapping mechanism can extend the effective meta-learning horizon without requiring backpropagation through all updates. We achieve a new state-of-the-art for model-free agents on the Atari ALE benchmark and demonstrate that it yields both performance and efficiency gains in multi-task meta-learning. Finally, we explore how bootstrapping opens up new possibilities and find that it can meta-learn efficient exploration in an epsilon-greedy Q-learning agent, without backpropagating through the update rule. Comment: Published at ICLR 2022. 37 pages, 19 figures, 9 tables
Published: 2021

43. Learning Expected Emphatic Traces for Deep RL

Author: Jiang, Ray, Zhang, Shangtong, Chelu, Veronica, White, Adam, and van Hasselt, Hado
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Off-policy sampling and experience replay are key for improving sample efficiency and scaling model-free temporal difference learning methods. When combined with function approximation, such as neural networks, this combination is known as the deadly triad and is potentially unstable. Recently, it has been shown that stability and good performance at scale can be achieved by combining emphatic weightings and multi-step updates. This approach, however, is generally limited to sampling complete trajectories in order, to compute the required emphatic weighting. In this paper we investigate how to combine emphatic weightings with non-sequential, off-line data sampled from a replay buffer. We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed $n$-step TD learning algorithm to learn the required emphatic weighting. We show that these state weightings reduce variance compared with prior approaches, while providing convergence guarantees. We tested the approach at scale on Atari 2600 video games, and observed that the new X-ETD($n$) agent improved over baseline agents, highlighting both the scalability and broad applicability of our approach.
Published: 2021

44. Association Between Adult Antibiotic Use, Microbial Dysbiosis and Atopic Conditions – A Systematic Review

Author: Ng WZJ, van Hasselt J, Aggarwal B, and Manoharan A
Subjects: atopy, allergic, asthma, microbiome, Immunologic diseases. Allergy, RC581-607
Abstract: Wan Zhen Janice Ng,1,* James van Hasselt,2,* Bhumika Aggarwal,3 Anand Manoharan4. 1School of Biological Sciences, Nanyang Technological University, Singapore, Singapore; 2GSK, Regional Medical Affairs, Bryanston, Gauteng, South Africa; 3Regional Respiratory Medical Affairs, GSK Plc, Singapore, Singapore; 4Infectious Diseases Medical & Scientific Affairs, GSK, Mumbai, India. *These authors contributed equally to this work. Correspondence: James van Hasselt, GSK, The Campus, Flushing Meadows, 57 Sloane Street, Bryanston, Gauteng, 2021, South Africa, Tel +27-82-330-8489, Email james.d.van-hasselt@gsk.com. Background: Strong associations between early antibiotic exposure and increased risk of childhood allergies have been established. Antibiotics have the potential to induce microbial dysbiosis that may be linked to allergic conditions. This review examines the limited available evidence on the associations between adult antibiotic use, microbial dysbiosis and atopic conditions. Methods: A systematic literature search was conducted using PubMed and Embase for relevant studies, published between 01-01-2000 and 08-17-2022. We searched for associations between antibiotic use, microbial dysbiosis, and allergic conditions in adults, defined as over 13 years of age for the purposes of this review. Results: Twenty-one studies were analyzed, with the inclusion of four narrative reviews as scarce relevant literature was found when stricter selection criteria were employed. Relevant studies predominantly focused on asthma. Significant microbial differences were observed in most measures between healthy subjects and subjects with allergic conditions. However, no system-wise and strain-wise associations were evident. Notably, at the phyla level, the Bacillota and Pseudomonadota phyla were associated with asthmatics, while the Actinobacteria phylum was linked to healthy controls. Asthmatics tend to reflect upregulation in the Bacillota and Pseudomonadota phyla in both airway and gut microbiomes. Conclusion: No compelling evidence could be found between adult antibiotic exposure, consequent microbial dysbiosis, and allergic conditions in adults. Our review is limited by scarce literature and therefore remains inconclusive. However, potential implications of antibiotic use impacting on allergic conditions justify additional research and heightened pharmacovigilance in this area. Keywords: atopy, allergic, asthma, microbiome
Published: 2023

45. Semi-physiological Enriched Population Pharmacokinetic Modelling to Predict the Effects of Pregnancy on the Pharmacokinetics of Cytotoxic Drugs

Author: Janssen, J. M., Damoiseaux, D., van Hasselt, J. G. C., Amant, F. C. H., van Calsteren, K., Beijnen, J. H., Huitema, A. D. R., and Dorlo, T. P. C.
Published: 2023
Full Text: View/download PDF

46. Integrative model-based comparison of target site-specific antimicrobial effects: A case study with ceftaroline and lefamulin

Author: van Os, Wisse, Pham, Anh Duc, Eberl, Sabine, Minichmayr, Iris K., van Hasselt, J.G. Coen, and Zeitlinger, Markus
Published: 2024
Full Text: View/download PDF

47. Mono-allelic KCNB2 variants lead to a neurodevelopmental syndrome caused by altered channel inactivation

Author: Bhat, Shreyas, Rousseau, Justine, Michaud, Coralie, Lourenço, Charles Marques, Stoler, Joan M., Louie, Raymond J., Clarkson, Lola K., Lichty, Angie, Koboldt, Daniel C., Reshmi, Shalini C., Sisodiya, Sanjay M., Hoytema van Konijnenburg, Eva M.M., Koop, Klaas, van Hasselt, Peter M., Démurger, Florence, Dubourg, Christèle, Sullivan, Bonnie R., Hughes, Susan S., Thiffault, Isabelle, Tremblay, Elisabeth Simard, Accogli, Andrea, Srour, Myriam, Blunck, Rikard, and Campeau, Philippe M.
Published: 2024
Full Text: View/download PDF

48. Severity-adjusted evaluation of liver transplantation on health outcomes in urea cycle disorders

Author: Mew, Nicholas Ah, Seminara, Jennifer, Burrage, Lindsay C., Berry, Gerard T., Breilyn, Margo, Schulze, Andreas, Harding, Cary O., Berry, Susan A., Wong, Derek, McCandless, Shawn E., Baumgartner, Matthias R., Konczal, Laura, Ficicioglu, Can, Diaz, George A., Coughlin, Curtis R., 2nd, Enns, Gregory M., Gallagher, Renata C., Lam, Christina, Stricker, Tamar, Wilkening, Greta, Dionisi-Vici, Carlo, Dobbelaere, Dries, Blasco-Alonso, Javier, Burlina, Alberto B., Freisinger, Peter, van Hasselt, Peter M., Skouma, Anastasia, Lund, Allan M., Vara, Roshni, Sarajlija, Adrijan, Morris, Andrew A., Chakrapani, Anupam, Barić, Ivo, Augoustides-Savvopoulou, Persephone, Chien, Yin-Hsiu, Cortès-Saladelafont, Elisenda, Eyskens, Francois, Gramer, Gwendolyn, Zeman, Jiri, Karall, Daniela, Couce, Maria L., Mühlhausen, Chris, Pedrón-Giner, Consuelo, Spiekerkoetter, Ute, Sykut-Cegielska, Jolanta, Wagenmakers, Margreet, Wijburg, Frits A., Posset, Roland, Garbade, Sven F., Gleich, Florian, Scharre, Svenja, Okun, Jürgen G., Gropman, Andrea L., Nagamani, Sandesh C.S., Druck, Ann-Catrin, Epp, Friederike, Hoffmann, Georg F., Kölker, Stefan, and Zielonka, Matthias
Published: 2024
Full Text: View/download PDF

49. Targeting Nrf2 to treat thyroid cancer

Author: Gong, Zhongqin, Xue, Lingbin, Li, Huangcan, Fan, Simiao, van Hasselt, Charles Andrew, Li, Dongcai, Zeng, Xianhai, Tong, Michael Chi Fai, and Chen, George Gong
Published: 2024
Full Text: View/download PDF

50. UPLC-Orbitrap-HRMS application for analysis of plasma sterols

Author: van der Ham, Maria, Gerrits, Johan, Prinsen, Berthil, van Hasselt, Peter, Fuchs, Sabine, Jans, Judith, Willems, Anke, and de Sain-van der Velden, Monique
Published: 2024
Full Text: View/download PDF
