Your search for author "Zhang, Kaiqing" returned 349 results.

Search Results

1. Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

2. RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

3. Do LLM Agents Have Regret? A Case Study in Online Learning and Games

4. Orthorhombic metal carbide-borides MeC$_2$B$_{12}$ (Me=Mg, Ca, Sr) from first principles: structure, stability and mechanical properties

5. Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

6. Robot Fleet Learning via Policy Merging

7. Partially Observable Multi-Agent Reinforcement Learning with Information Sharing

8. Multi-Player Zero-Sum Markov Games with Networked Separable Interactions

9. Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

10. Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs

11. Self-Supervised Reinforcement Learning that Transfers using Random Features

12. Learning to Extrapolate: A Transductive Approach

13. A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

14. Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation

15. Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?

16. Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation

17. An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods

18. Mechanical properties of AlMgB14-related boron carbide structures. A first principle study

19. Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence

20. Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?

21. Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

22. The Power of Regularization in Solving Extensive-Form Games

23. What is a Good Metric to Study Generalization of Minimax Learners?

24. Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

25. Byzantine-Robust Online and Offline Distributed Reinforcement Learning

26. Fictitious Play in Markov Games with Single Controller

27. Design and Commissioning of A Beam Distribution System for Multiple Undulator Line Operation of the SXFEL-UF

28. The Complexity of Markov Equilibrium in Stochastic Games

30. Globally Convergent Policy Search over Dynamic Filters for Output Estimation

31. Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence

32. Do Differentiable Simulators Give Better Policy Gradients?

33. Remote optogenetic control of the enteric nervous system and brain-gut axis in freely-behaving mice enabled by a wireless, battery-free optoelectronic device

34. Independent Learning in Stochastic Games

35. On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning

39. Decentralized Q-Learning in Zero-sum Markov Games

41. Learning Safe Multi-Agent Control with Decentralized Neural Barrier Certificates

42. Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity

43. Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup

44. Self-Amplification of Coherent Energy Modulation in Seeded Free-Electron Lasers

45. Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control

46. Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games

47. Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

48. POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

49. Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning

50. Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games
