Your search for author "LUO Ping" (publication year range: this year) returned 256 results.

Search Results

1. Evaluation of the impact of climate change on the ecological resistance and ecological corridors based on set pair analysis theory

2. RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version)

3. Federated Prediction-Powered Inference from Decentralized Data

4. Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing

5. HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model

6. MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

7. AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

8. Low-Latency Privacy-Preserving Deep Learning Design via Secure MPC

9. Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

10. Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

11. Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

12. TCFormer: Visual Recognition via Token Clustering Transformer

13. When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset

14. EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

15. IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

16. PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

17. DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

18. Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

19. GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices

20. VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks

21. Needle In A Multimodal Haystack

22. Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

23. Uncovering Limitations of Large Language Models in Information Seeking from Tables

24. Learning Manipulation by Predicting Interaction

25. Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View

26. Part123: Part-aware 3D Reconstruction from a Single-view Image

27. AnalogCoder: Analog Circuit Design via Training-Free Code Generation

28. SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge

29. Score-based Generative Models with Adaptive Momentum

30. KET-QA: A Dataset for Knowledge Enhanced Table Question Answering

31. Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

32. Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs

33. UniFS: Universal Few-shot Instance Perception with Point Representations

34. MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

35. Adapting LLaMA Decoder to Vision Transformer

36. End-to-End Autonomous Driving through V2X Cooperation

37. DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

38. ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models

39. FlashFace: Human Image Personalization with High-fidelity Identity Preservation

40. DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

41. Accelerating Federated Learning by Selecting Beneficial Herd of Local Gradients

42. Zero-shot Generative Linguistic Steganography

43. AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Adversarial Visual-Instructions

44. GenAD: Generalized Predictive Model for Autonomous Driving

45. ACT-MNMT Auto-Constriction Turning for Multilingual Neural Machine Translation

46. PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

47. RegionGPT: Towards Region Understanding Vision Language Model

48. Position: Towards Implicit Prompt For Text-To-Image Models

49. RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

50. AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks
