Search

Search for author "Yang, Zhuoran": 647 results in total

Search Results

206. One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

207. False Correlation Reduction for Offline Reinforcement Learning

208. Understanding Implicit Regularization in Over-Parameterized Single Index Model

209. Online Bootstrap Inference for Policy Evaluation in Reinforcement Learning

229. Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games

230. Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes

231. Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

232. Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

233. Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation

234. The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches

235. Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

236. Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach

237. Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information

238. Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

239. Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets

240. Offline Policy Optimization in RL with Variance Regularization

241. Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning

242. Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

243. Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes

244. Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

245. Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets

250. Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
