Search

Your search for the keyword "Policy Gradient" returned 429 results

Search Results

1. Convergence of a L2 Regularized Policy Gradient Algorithm for the Multi Armed Bandit

2. The RL Toolkit: A Spectrum of Algorithms

3. A multi-step on-policy deep reinforcement learning method assisted by off-policy policy evaluation.

4. Relabeling and policy distillation of hierarchical reinforcement learning.

5. HG-search: multi-stage search for heterogeneous graph neural networks.

6. Experimental Implementation of a TD3 Agent Based Speed Controller for Direct Torque Control of PMSM Drives.

7. Sample complexity of variance-reduced policy gradient: weaker assumptions and lower bounds.

8. CONVERGENCE OF ENTROPY-REGULARIZED NATURAL POLICY GRADIENT WITH LINEAR FUNCTION APPROXIMATION.

9. Investigating the Efficacy of Deep Reinforcement Learning Models in Detecting and Mitigating Cyber-attacks: a Novel Approach.

10. Multi-agent cooperative electronic countermeasure method based on reinforcement learning [基于强化学习的多智能体协同电子对抗方法].

11. Anti-conflict AGV path planning in automated container terminals based on multi-agent reinforcement learning.

12. Landscape Analysis of Stochastic Policy Gradient Methods

13. Enhancing Adversarial Robustness for Deep Metric Learning via Attention-Aware Knowledge Guidance

14. A Reinforcement Learning Framework for Lung Segmentation of COVID-19 and Pneumonia Affected Chest X-Ray Image

16. Enhancing Policy Gradient for Traveling Salesman Problem with Data Augmented Behavior Cloning

19. Reinforce Model Tracklet for Multi-Object Tracking

20. List-Based Workflow Scheduling Utilizing Deep Reinforcement Learning

22. A policy gradient hyper-heuristic algorithm for solving the capacitated vehicle routing problem [策略梯度的超启发算法求解带容量约束车辆路径问题].

23. Reinforcement learning with dynamic convex risk measures.

24. Optimization of Reinforcement Learning Using Quantum Computation

25. Optimal Power Allocation in Optical GEO Satellite Downlinks Using Model-Free Deep Learning Algorithms.

26. Adaptive bias-variance trade-off in advantage estimator for actor–critic algorithms.

27. Vision-based control in the open racing car simulator with deep and reinforcement learning.

28. Reinforcement learning-based cost-sensitive classifier for imbalanced fault classification.

29. FeMIP: detector-free feature matching for multimodal images with policy gradient.

30. An action-stable update algorithm for policy gradient algorithms in continuous time [用于连续时间中策略梯度算法的动作稳定更新算法].

31. Combining Neural Networks with Logic Rules.

32. Modeling limit order trading with a continuous action policy for deep reinforcement learning.

33. Reinforced mixture learning.

34. Regret Analysis of a Markov Policy Gradient Algorithm for Multiarm Bandits.

35. Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games.

36. BLOCK POLICY MIRROR DESCENT.

37. Credit assignment with predictive contribution measurement in multi-agent reinforcement learning.

38. DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning

39. Standardising policy and technology responses in the immediate aftermath of a pandemic: a comparative and conceptual framework

40. Policy Gradient for Arabic to English Neural Machine Translation

41. RLPassGAN: Password Guessing Model Based on GAN with Policy Gradient

42. Policy Gradient Reinforcement Learning Method for Backward Motion Control of Tractor-Trailer Mobile Robot

43. An Open Domain Question Answering System Trained by Reinforcement Learning

44. Robust reinforcement learning algorithm based on pigeon-inspired optimization

45. A task allocation algorithm based on reinforcement learning in spatio-temporal crowdsourcing.

46. Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics.

47. Decentralized multi-task reinforcement learning policy gradient method with momentum over networks.

48. Multi-label sequence generating model via label semantic attention mechanism.

49. Reinforcement Learning-Based Approach for Minimizing Energy Loss of Driving Platoon Decisions.

50. Human Pathogenic Monkeypox Disease Recognition Using Q-Learning Approach.
