Your search for "Rong, P" returned 456 results

Search Constraints

You searched for:
Author: "Rong, P"
Topic: computer science - computer vision and pattern recognition

Search Results

1. Generative Visual Commonsense Answering and Explaining with Generative Scene Graph Constructing

2. Detection of AI Deepfake and Fraud in Online Payments Using GAN-Based Models

3. SplatMAP: Online Dense Monocular SLAM with 3D Gaussian Splatting

4. Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images

5. CFFormer: Cross CNN-Transformer Channel Attention and Spatial Feature Fusion for Improved Segmentation of Low Quality Medical Images

6. Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

7. Leverage Cross-Attention for End-to-End Open-Vocabulary Panoptic Reconstruction

8. MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception

9. JADE: Joint-aware Latent Diffusion for 3D Human Generative Modeling

10. RealisID: Scale-Robust and Fine-Controllable Identity Customization via Local and Global Complementation

11. Trusted Mamba Contrastive Network for Multi-View Clustering

12. S$^3$-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

13. Progressive Multimodal Reasoning via Active Retrieval

14. Multi-Level Embedding and Alignment Network with Consistency and Invariance Learning for Cross-View Geo-Localization

15. AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration

16. IGR: Improving Diffusion Model for Garment Restoration from Person Image

17. Distribution-Consistency-Guided Multi-modal Hashing

18. SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models

19. SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers

20. Copy-Move Detection in Optical Microscopy: A Segmentation Network and A Dataset

21. SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding

22. INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

23. Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing

24. SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

25. Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark

26. AnimateAnything: Consistent and Controllable Animation for Video Generation

27. Normative Modeling for AD Diagnosis and Biomarker Identification

28. Face De-identification: State-of-the-art Methods and Comparative Studies

29. Elucidating the design space of language models for image generation

30. BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities

31. Exploring the Design Space of Visual Context Representation in Video MLLMs

32. VividMed: Vision Language Model with Versatile Visual Grounding for Medicine

33. On-the-fly Modulation for Balanced Multimodal Learning

34. The Roles of Contextual Semantic Relevance Metrics in Human Visual Processing

35. TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text

36. GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling

37. Mixture of Prompt Learning for Vision Language Models

38. TTT-Unet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation

39. Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photorealistic Appearance from Multi-View Video

40. Visual Grounding with Multi-modal Conditional Adaptation

41. Fisheye-GS: Lightweight and Extensible Gaussian Splatting Module for Fisheye Cameras

42. PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation

43. A New People-Object Interaction Dataset and NVS Benchmarks

44. Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion

45. ActionPose: Pretraining 3D Human Pose Estimation with the Dark Knowledge of Action

46. RING#: PR-by-PE Global Localization with Roto-translation Equivariant Gram Learning

47. FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry

48. GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars

49. MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

50. CathAction: A Benchmark for Endovascular Intervention Understanding
