Search

Your search for "LUO Ping" returned 134 results in total.

Search Constraints

You searched for: Author "LUO Ping"; Topic "fos: computer and information sciences"
Search Results

1. InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

2. Align, Adapt and Inject: Sound-guided Unified Image Generation

3. Scene as Occupancy

4. RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

5. VDT: An Empirical Study on Video Diffusion with Transformers

6. VideoChat: Chat-Centric Video Understanding

7. InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

8. DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

9. Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

10. DDP: Diffusion Model for Dense Visual Prediction

11. Real-time Controllable Denoising for Image and Video

12. Accelerating Vision-Language Pretraining with Free Language Modeling

13. Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

14. Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

15. Multi-Level Contrastive Learning for Dense Prediction Task

16. LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models

17. Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following

18. GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

19. EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought

20. Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning

21. Vehicle-Infrastructure Cooperative 3D Object Detection via Feature Flow Prediction

22. AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

23. ChiPFormer: Transferable Chip Placement via Offline Decision Transformer

24. MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

25. VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

26. Dense Distinct Query for End-to-End Object Detection

27. Topology Reasoning for Driving Scenes

28. SyNDock: N Rigid Protein Docking via Learnable Group Synchronization

29. EGC: Image Generation and Classification via a Diffusion Energy-Based Model

30. MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

31. EC²: Emergent Communication for Embodied Control

32. Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

33. DiffRate: Differentiable Compression Rate for Efficient Vision Transformers

34. RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

35. π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation

36. Going Denser with Open-Vocabulary Part Segmentation

37. Universal Instance Perception as Object Discovery and Retrieval

38. Policy Adaptation from Foundation Model Feedback

39. Prototypical context-aware dynamics generalization for high-dimensional model-based reinforcement learning

40. Large-batch Optimization for Dense Visual Predictions

41. Polygon-Free: Unconstrained Scene Text Detection with Box Annotations

42. Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning

43. FedVeca: Federated Vectorized Averaging on Non-IID Data with Adaptive Bi-directional Global Objective

44. Rethinking Resolution in the Context of Efficient Video Recognition

45. Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe

46. PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation

47. 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal

48. CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer

49. VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

50. CO³: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving
