Search

Your search for author "Zhai, Xiaohua" returned 185 results

Search Results

1. PaliGemma 2: A Family of Versatile VLMs for Transfer

2. PaliGemma: A versatile 3B VLM for transfer

3. Toward a Diffusion-Based Generalist for Dense Vision Tasks

4. No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models

5. LocCa: Visual Pretraining with Location-aware Captioners

6. CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?

7. SILC: Improving Vision Language Pretraining with Self-distillation

8. SILC: Improving Vision Language Pretraining with Self-Distillation

9. PaLI-3 Vision Language Models: Smaller, Faster, Stronger

10. Image Captioners Are Scalable Vision Learners Too

11. PaLI-X: On Scaling up a Multilingual Vision and Language Model

12. Three Towers: Flexible Contrastive Learning with Pretrained Image Models

13. Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

14. A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

15. Sigmoid Loss for Language Image Pre-Training

16. Tuning computer vision models with task rewards

17. Scaling Vision Transformers to 22 Billion Parameters

19. FlexiViT: One Model for All Patch Sizes

20. PaLI: A Jointly-Scaled Multilingual Language-Image Model

21. Revisiting Neural Scaling Laws in Language and Vision

22. UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

23. Simple Open-Vocabulary Object Detection with Vision Transformers

24. Better plain ViT baselines for ImageNet-1k

25. A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

26. LiT: Zero-Shot Transfer with Locked-image text Tuning

27. How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

28. Revisiting the Calibration of Modern Neural Networks

29. Knowledge distillation: A good teacher is patient and consistent

30. Scaling Vision Transformers

31. MLP-Mixer: An all-MLP Architecture for Vision

32. SI-Score: An image dataset for fine-grained analysis of robustness to object location, rotation and size

33. Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark

34. Underspecification Presents Challenges for Credibility in Modern Machine Learning

35. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

36. Training general representations for remote sensing using in-domain knowledge

37. On Robustness and Transferability of Convolutional Neural Networks

38. Are we done with ImageNet?

39. Big Transfer (BiT): General Visual Representation Learning

40. Self-Supervised Learning of Video-Induced Visual Invariances

41. In-domain representation learning for remote sensing

42. A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

43. S4L: Self-Supervised Semi-Supervised Learning

44. High-Fidelity Image Generation With Fewer Labels

45. Revisiting Self-Supervised Visual Representation Learning

46. A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation

47. Simple Open-Vocabulary Object Detection

48. Self-Supervised GANs via Auxiliary Rotation Loss

49. Self-Supervised GAN to Counter Forgetting

50. A Large-Scale Study on Regularization and Normalization in GANs
