232 results for "Xiaodan Liang"
Search Results
2. MLP Can Be a Good Transformer Learner.
3. AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis.
4. Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models.
5. Contrastive Learning with Counterfactual Explanations for Radiology Report Generation.
6. ATG: Benchmarking Automated Theorem Generation for Generative Language Models.
7. CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation.
8. VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models.
9. RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter.
10. CLOMO: Counterfactual Logical Modification with Large Language Models.
11. MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation.
12. Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model.
13. Monocular 3D Hand Mesh Recovery via Dual Noise Estimation.
14. PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping.
15. 3D Visibility-Aware Generalizable Neural Radiance Fields for Interacting Hands.
16. Ins-DetCLIP: Aligning Detection Model to Follow Human-Language Instruction.
17. DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning.
18. LEGO-Prover: Neural Theorem Proving with Growing Libraries.
19. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data.
20. MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation.
21. GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training.
22. CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation.
23. Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images.
24. LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts.
25. DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment.
26. DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability.
27. Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos.
28. FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration.
29. Composable Text Controls in Latent Space with ODEs.
30. TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models.
31. Application of Intelligent Mobile Terminal in Virtual Building Construction Training Teaching.
32. GP-VTON: Towards General Purpose Virtual Try-On via Collaborative Local-Flow Global-Parsing Learning.
33. Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation.
34. CapDet: Unifying Dense Captioning and Open-World Detection Pretraining.
35. CLIP²: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data.
36. Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection.
37. Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving.
38. DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment.
39. Learning to Segment Every Referring Object Point by Point.
40. Vision Language Navigation with Knowledge-driven Environmental Dreamer.
41. DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-level Value Function.
42. RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog.
43. 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation.
44. Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation.
45. NLIP: Noise-Robust Language-Image Pre-training.
46. LogicSolver: Towards Interpretable Math Word Problem Solving with Logical Prompt-enhanced Learning.
47. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression.
48. RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection via Relational Contrastive Learning.
49. MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure.
50. Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning.
Discovery Service for Jio Institute Digital Library