Search

Author search for "Qiao, Yu": 7,515 results

Search Results

1. GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

2. GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI

3. VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

4. OASIS: Open Agent Social Interaction Simulations with One Million Agents

5. MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map

6. Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

7. ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving

8. OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

9. TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

10. DeMuVGN: Effective Software Defect Prediction Model by Learning Multi-view Software Dependency via Graph Neural Networks

11. FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

12. DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes

13. An Intelligent Agentic System for Complex Image Restoration Problems

14. Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

15. Diffusion Transformer Policy

16. REEF: Representation Encoding Fingerprints for Large Language Models

17. TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

18. SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding

19. Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

20. Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

21. ToMiE: Towards Modular Growth in Enhanced SMPL Skeleton for 3D Human with Animatable Garments

22. Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

23. Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

24. MinerU: An Open-Source Solution for Precise Document Content Extraction

25. Reasoning Multi-Agent Behavioral Topology for Interactive Autonomous Driving

26. Inference-Time Language Model Alignment via Integrated Value Guidance

27. CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation

28. PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

29. Boosting Federated Domain Generalization: Understanding the Role of Advanced Pre-Trained Architectures

30. GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction

31. A Preliminary Exploration Towards General Image Restoration

32. MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

33. Learning A Low-Level Vision Generalist via Visual Task Prompt

34. GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

35. VisionUnite: A Vision-Language Foundation Model for Ophthalmology Enhanced with Clinical Knowledge

36. MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

37. Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

38. DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving

39. Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

40. Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models

41. The Shadow of Fraud: The Emerging Danger of AI-powered Social Engineering and its Possible Cure

42. MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

43. ViLLa: Video Reasoning Segmentation with Large Language Model

44. The Better Angels of Machine Personality: How Personality Relates to LLM Safety

45. Navigating the Data Trading Crossroads: An Interdisciplinary Survey

46. GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity

47. Building Intelligence Identification System via Large Language Model Watermarking: A Survey and Beyond

48. GRUtopia: Dream General Robots in a City at Scale

49. Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification

50. VEnhancer: Generative Space-Time Enhancement for Video Generation
