Search

Your search for author "Ding, Mingyu" returned 494 results.


Search Results

1. Compositional Physical Reasoning of Objects and Events from Videos

2. WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

3. Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

4. RoadBEV: Road Surface Reconstruction in Bird's Eye View

5. A road surface reconstruction dataset for autonomous driving.

6. Q-SLAM: Quadric Representations for Monocular SLAM

7. DrPlanner: Diagnosis and Repair of Motion Planners for Automated Vehicles Using Large Language Models

8. PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models

9. RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

10. RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation

11. Depth-aware Volume Attention for Texture-less Stereo Matching

12. SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

13. A Survey of Reasoning with Foundation Models

14. Interfacing Foundation Models' Embeddings

15. EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning

16. Open X-Embodiment: Robotic Learning Datasets and RT-X Models

17. Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

18. TextPSG: Panoptic Scene Graph Generation from Textual Descriptions

19. LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

20. Human-oriented Representation Learning for Robotic Manipulation

21. Generalizable Long-Horizon Manipulations with Large Language Models

22. RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving

23. Towards Free Data Selection with General-Purpose Models

24. Pre-training on Synthetic Driving Data for Trajectory Prediction

25. An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training

26. Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

27. Doubly Robust Self-Training

28. EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought

29. VDT: General-purpose Video Diffusion Transformers via Mask Modeling

30. Quadric Representations for LiDAR Odometry, Mapping and Localization

31. EC^2: Emergent Communication for Embodied Control

32. Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following

33. Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

36. Planning with Large Language Models for Code Generation

37. UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

38. AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

39. Understanding Self-Supervised Pretraining with Part-Aware Representation Learning

40. Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners

41. Physion++: Evaluating Physical Scene Understanding with Objects Consisting of Different Physical Attributes in Humans and Machines

42. NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields

43. LGDN: Language-Guided Denoising Network for Video-Language Modeling

44. Multimodal foundation models are better simulators of the human brain

46. CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer

47. ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

48. DaViT: Dual Attention Vision Transformers

49. Context Autoencoder for Self-Supervised Representation Learning
