26 results for "Zhizhou, Ren"
Search Results
2. DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data.
3. Full-Atom Peptide Design based on Multi-modal Flow Matching.
4. Bridging distribution gaps: invariant pattern discovery for dynamic graph learning.
5. Off-Policy Reinforcement Learning with Delayed Rewards.
6. Self-Organized Polynomial-Time Coordination Graphs.
7. Proximal Exploration for Model-guided Protein Sequence Design.
8. Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization.
9. On the Estimation Bias in Double Q-Learning.
10. Generalizable Episodic Memory for Deep Reinforcement Learning.
11. Object-Oriented Dynamics Learning through Multi-Level Abstraction.
12. Learning Long-Term Reward Redistribution via Randomized Return Decomposition.
13. Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation.
14. Exploration via Hindsight Goal Generation.
15. Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation.
16. Mid-Level Data Fusion Combined with the Fingerprint Region for Classification DON Levels Defect of Fusarium Head Blight Wheat.
17. QPLEX: Duplex Dueling Multi-Agent Q-Learning.
18. Learning Long-Term Reward Redistribution via Randomized Return Decomposition.
19. Generalizable Episodic Memory for Deep Reinforcement Learning.
20. Off-Policy Reinforcement Learning with Delayed Rewards.
21. On the Estimation Bias in Double Q-Learning.
22. Self-Organized Polynomial-Time Coordination Graphs.
23. QPLEX: Duplex Dueling Multi-Agent Q-Learning.
24. Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning.
25. Exploration via Hindsight Goal Generation.
26. Object-Oriented Dynamics Learning through Multi-Level Abstraction.
Discovery Service for Jio Institute Digital Library