Search results for author "Lin, Dahua": 870 results in total

Search Results

1. MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

2. PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

3. InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems

4. SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

5. Training Language Models to Critique With Multi-agent Feedback

6. VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding

7. ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

8. LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

9. Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate

10. Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

11. BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way

12. MinerU: An Open-Source Solution for Precise Document Content Extraction

13. Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation

14. Scaling Behavior for Large Language Models regarding Numeral Systems: An Example using Pythia

15. 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

16. What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices

17. UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

18. CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis

19. LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation

20. Efficient Training of Large Language Models on Distributed Infrastructures: A Survey

21. HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation

22. LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover

23. OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection

24. Case2Code: Learning Inductive Reasoning with Synthetic Data

25. VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

26. CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

27. GRUtopia: Dream General Robots in a City at Scale

28. VEnhancer: Generative Space-Time Enhancement for Video Generation

29. Rethinking Image-to-Video Adaptation: An Object-centric Perspective

30. Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

31. ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

32. InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

33. InternLM-Law: An Open Source Chinese Legal Large Language Model

34. Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

35. MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

36. OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

37. SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention

38. MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

39. V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

40. MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

41. OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

42. Uncertainty Aware Learning for Language Model Alignment

43. ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

44. Lean Workbook: A large-scale Lean problem set formalized from natural language math problems

45. OpenDataLab: Empowering General Artificial Intelligence with Open Datasets

46. Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

47. ANAH: Analytical Annotation of Hallucinations in Large Language Models

48. AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

49. DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data

50. Streaming Long Video Understanding with Large Language Models
