Search

Your search for "LUO Ping" returned 115 results

Search Constraints

You searched for: Author "LUO Ping"; Topic: computer vision and pattern recognition (cs.cv)
Search Results

1. InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

2. Align, Adapt and Inject: Sound-guided Unified Image Generation

3. Scene as Occupancy

4. RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

5. VDT: An Empirical Study on Video Diffusion with Transformers

6. VideoChat: Chat-Centric Video Understanding

7. InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language

8. DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

9. Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

10. DDP: Diffusion Model for Dense Visual Prediction

11. Real-time Controllable Denoising for Image and Video

12. Accelerating Vision-Language Pretraining with Free Language Modeling

13. Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

14. Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

15. Multi-Level Contrastive Learning for Dense Prediction Task

16. LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models

17. Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following

18. GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

19. EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought

20. Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning

21. Vehicle-Infrastructure Cooperative 3D Object Detection via Feature Flow Prediction

22. MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

23. VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

24. Dense Distinct Query for End-to-End Object Detection

25. Topology Reasoning for Driving Scenes

26. EGC: Image Generation and Classification via a Diffusion Energy-Based Model

27. MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

28. EC^2: Emergent Communication for Embodied Control

29. Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

30. DiffRate: Differentiable Compression Rate for Efficient Vision Transformers

31. RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

32. π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation

33. Going Denser with Open-Vocabulary Part Segmentation

34. Universal Instance Perception as Object Discovery and Retrieval

35. Policy Adaptation from Foundation Model Feedback

36. Large-batch Optimization for Dense Visual Predictions

37. Polygon-Free: Unconstrained Scene Text Detection with Box Annotations

38. Rethinking Resolution in the Context of Efficient Video Recognition

39. Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe

40. PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation

41. 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal

42. CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer

43. VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

44. CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving

45. RestoreFormer: High-Quality Blind Face Restoration from Undegraded Key-Value Pairs

46. Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

47. Bridging Video-text Retrieval with Multiple Choice Questions

48. End-to-End Video Text Spotting with Transformer

49. WegFormer: Transformers for Weakly Supervised Semantic Segmentation

50. Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning
