Search

Your search for author "Song, Guanglu" returned 147 results

Search Results

1. Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning

2. MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

3. Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

4. Phased Consistency Model

5. Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

6. MoVA: Adapting Mixture of Vision Experts to Multimodal Context

7. Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

8. CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

9. Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

10. Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

11. FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

12. AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data

13. Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

14. Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks

15. Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation

16. Towards Large-scale Masked Face Recognition

17. Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection

18. RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

19. Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising

20. Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

21. DETRs with Collaborative Hybrid Assignments Training

22. Teach-DETR: Better Training DETR with Teachers

23. Large-batch Optimization for Dense Visual Predictions

25. Towards Robust Face Recognition with Comprehensive Search

26. Unifying Visual Perception by Dispersible Points Learning

27. Rethinking Robust Representation Learning Under Fine-grained Noisy Faces

28. UniNet: Unified Architecture Search with Convolution, Transformer, and MLP

29. UniFormer: Unifying Convolution and Self-attention for Visual Recognition

30. UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning

31. Self-slimmed Vision Transformer

32. INTERN: A New Learning Paradigm Towards General Vision

33. UniNet: Unified Architecture Search with Convolution, Transformer, and MLP

34. FNAS: Uncertainty-Aware Fast Neural Architecture Search

35. Discriminability Distillation in Group Representation Learning

36. 1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020

37. Revisiting the Sibling Head in Object Detector

38. 1st Place Solutions for OpenImage2019 -- Object Detection and Instance Segmentation

39. KPNet: Towards Minimal Face Detector

40. Top-1 Solution of Multi-Moments in Time Challenge 2019

41. Towards Flops-constrained Face Recognition

42. Self-slimmed Vision Transformer

43. Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy

44. Region-based Quality Estimation Network for Large-scale Person Re-identification

45. Scale Semantic Flow Preserving Across Image Pyramid

50. Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models
