Search

Your search for "Duan, Nan" returned 23 results

Search Constraints

You searched for: Author "Duan, Nan"; Topic: computer vision and pattern recognition (cs.cv)

Search Results

1. GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

2. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation

3. Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

4. Learning 3D Photography Videos via Self-supervised Diffusion on Single Images

5. ReCo: Region-Controlled Text-to-Image Generation

6. An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

7. HORIZON: A High-Resolution Panorama Synthesis Framework

8. CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

9. NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

10. BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning

11. NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN

12. DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder

13. KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation

14. CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval

15. NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

16. GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions

17. M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

18. UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

19. XGPT: Cross-modal Generative Pre-Training for Image Captioning

20. PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

21. Deep Reason: A Strong Baseline for Real-World Visual Reasoning

22. Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training

23. Visual Question Generation as Dual Task of Visual Question Answering
