Author: "Qinyun Tang" / Topic: decision making - Searchworks@Jio Institute Digital Library Search Results

Your search for "Qinyun Tang" returned 2 results

1. Realistic Actor-Critic: A framework for balance between value overestimation and underestimation.

Author: Sicen Li, Qinyun Tang, Yiming Pang, Xinmeng Ma, and Gang Wang
Subjects: MACHINE learning, REINFORCEMENT learning, SOURCE code, DECISION making, ESTIMATION bias
Abstract: Introduction: The value approximation bias is known to lead to suboptimal policies or catastrophic overestimation bias accumulation that prevent the agent from making the right decisions between exploration and exploitation. Algorithms have been proposed to mitigate the above contradiction. However, we still lack an understanding of how the value bias impacts performance and a method for efficient exploration while keeping stable updates. This study aims to clarify the effect of the value bias and improve the reinforcement learning algorithms to enhance sample efficiency. Methods: This study designs a simple episodic tabular MDP to research value underestimation and overestimation in actor-critic methods. This study proposes a unified framework called Realistic Actor-Critic (RAC), which employs Universal Value Function Approximators (UVFA) to simultaneously learn policies with different value confidence bounds with the same neural network, each with a different under-overestimation trade-off. Results: This study highlights that agents could over-explore low-value states due to an inflexible under-overestimation trade-off in the fixed hyperparameters setting, which is a particular form of the exploration-exploitation dilemma. RAC performs directed exploration without over-exploration using the upper bounds while still avoiding overestimation using the lower bounds. Through carefully designed experiments, this study empirically verifies that RAC achieves 10x sample efficiency and 25% performance improvement compared to Soft Actor-Critic in the most challenging Humanoid environment. All the source code is available at https://github.com/ihuhuhu/RAC.
Discussion: This research not only provides valuable insights for research on the exploration-exploitation trade-off by studying how frequently policies access low-value states under the guidance of different value confidence bounds, but also proposes a new unified framework that can be combined with current actor-critic methods to improve sample efficiency in the continuous control domain. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

2. Realistic Actor-Critic: A framework for balance between value overestimation and underestimation.

Author: Sicen Li, Qinyun Tang, Yiming Pang, Xinmeng Ma, and Gang Wang
Subjects: MACHINE learning, REINFORCEMENT learning, SOURCE code, DECISION making, ESTIMATION bias
Abstract: Introduction: The value approximation bias is known to lead to suboptimal policies or catastrophic overestimation bias accumulation that prevent the agent from making the right decisions between exploration and exploitation. Algorithms have been proposed to mitigate the above contradiction. However, we still lack an understanding of how the value bias impacts performance and a method for efficient exploration while keeping stable updates. This study aims to clarify the effect of the value bias and improve the reinforcement learning algorithms to enhance sample efficiency. Methods: This study designs a simple episodic tabular MDP to research value underestimation and overestimation in actor-critic methods. This study proposes a unified framework called Realistic Actor-Critic (RAC), which employs Universal Value Function Approximators (UVFA) to simultaneously learn policies with different value confidence bounds with the same neural network, each with a different under-overestimation trade-off. Results: This study highlights that agents could over-explore low-value states due to an inflexible under-overestimation trade-off in the fixed hyperparameters setting, which is a particular form of the exploration-exploitation dilemma. RAC performs directed exploration without over-exploration using the upper bounds while still avoiding overestimation using the lower bounds. Through carefully designed experiments, this study empirically verifies that RAC achieves 10x sample efficiency and 25% performance improvement compared to Soft Actor-Critic in the most challenging Humanoid environment. All the source code is available at https://github.com/ihuhuhu/RAC.
Discussion: This research not only provides valuable insights for research on the exploration-exploitation trade-off by studying how frequently policies access low-value states under the guidance of different value confidence bounds, but also proposes a new unified framework that can be combined with current actor-critic methods to improve sample efficiency in the continuous control domain. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
