Search

Your search for "ZHANG Wenwei" returned 46 results

Search Constraints

Author: "ZHANG Wenwei"
Topic: computer science - computer vision and pattern recognition
Search Results

1. 4D Contrastive Superflows are Dense 3D Representation Learners

2. InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

3. ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities

4. F-LMM: Grounding Frozen Large Multimodal Models

5. Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving

6. An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models

7. The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

8. Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

9. InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

10. Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

11. InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

12. EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

13. CLIM: Contrastive Language-Image Mosaic for Region Representation

14. Mixed Pseudo Labels for Semi-Supervised Object Detection

15. OV-PARTS: Towards Open-Vocabulary Part Segmentation

16. CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

17. DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

18. InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

19. Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection

20. Unified Human-Scene Interaction via Prompted Chain-of-Contacts

21. GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

22. Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

23. MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

24. Transformer-Based Visual Segmentation: A Survey

25. RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

26. Robo3D: Towards Robust and Reliable 3D Perception against Corruptions

27. Position-Guided Point Cloud Panoptic Segmentation Transformer

28. MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

29. Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation

30. Dense Distinct Query for End-to-End Object Detection

31. Aligning Bag of Regions for Open-Vocabulary Object Detection

32. RTMDet: An Empirical Study of Designing Real-Time Object Detectors

33. MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

34. MMRotate: A Rotated Object Detection Benchmark using PyTorch

35. Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation

36. Dense Siamese Network for Dense Unsupervised Learning

37. MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

38. K-Net: Towards Unified Image Segmentation

39. Exploring Data Augmentation for Multi-Modality 3D Object Detection

40. Seesaw Loss for Long-Tailed Instance Segmentation

41. More Information Supervised Probabilistic Deep Face Embedding Learning

42. Deep Frequent Spatial Temporal Learning for Face Anti-Spoofing

43. EcoNAS: Finding Proxies for Economical Neural Architecture Search

44. Side-Aware Boundary Localization for More Precise Object Detection

45. Robust Multi-Modality Multi-Object Tracking

46. Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation
