404 results for "LIJUAN WANG"
Search Results
2. MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction.
3. Block diagonal representation learning with local invariance for face clustering.
4. Multimodal Foundation Models: From Specialists to General-Purpose Assistants.
5. Optimization of Electrostatic Sensors for Rotational Speed Measurement of a Metallic Rotor.
6. StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.
7. MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities.
8. Completing Visual Objects via Bridging Generation and Segmentation.
9. Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning.
10. Cross-border Commodity Pricing Strategy Optimization via Mixed Neural Network for Time Series Analysis.
11. MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities.
12. AutoDirector: Online Auto-scheduling Agents for Multi-sensory Composition.
13. IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation.
14. Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness.
15. VideoGUI: A Benchmark for GUI Automation from Instructional Videos.
16. StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis.
17. MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos.
18. Bring Metric Functions into Diffusion Models.
19. Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation.
20. List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs.
21. Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition.
22. Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning.
23. COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training.
24. The landscape of the methodology in drug repurposing using human genomic data: a systematic review.
25. Equivariant Similarity for Vision-Language Foundation Models.
26. An Empirical Study of Multimodal Model Merging.
27. Optimization of Electrostatic Sensors for Rotational Speed Measurement.
28. Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network.
29. Adaptive Human Matting for Dynamic Videos.
30. Non-Contrastive Learning Meets Language-Image Pre-Training.
31. Neural Voting Field for Camera-Space 3D Hand Pose Estimation.
32. An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling.
33. Generalized Decoding for Pixel, Image, and Language.
34. LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling.
35. ReCo: Region-Controlled Text-to-Image Generation.
36. Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.
37. NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
38. Research on the Construction of Quality Evaluation System for Cultivation of Excellent Engineers Based on AHP-Grey Fuzzy Method.
39. Zero-Shot Human-Object Interaction (HOI) Classification by Bridging Generative and Contrastive Image-Language Models.
40. Density peaks clustering algorithm based on improved similarity and allocation strategy.
41. Measurement of Cross-Sectional Velocity Distribution of Pneumatically Conveyed Particles in a Square-Shaped Pipe Through Gaussian Process Regression-Assisted Nonrestrictive Electrostatic Sensing.
42. Detection of Antarctic Surface Meltwater Using Sentinel-2 Remote Sensing Images via U-Net With Attention Blocks: A Case Study Over the Amery Ice Shelf.
43. Facial age estimation based on asymmetrical label distribution.
44. Estimation of the Bio-Parameters of Winter Wheat by Combining Feature Selection with Machine Learning Using Multi-Temporal Unmanned Aerial Vehicle Multispectral Images.
45. D2S: Dynamic Distribution Supervision for Multi-Label Facial Expression Recognition.
46. Instantaneous Rotational Speed Measurement of Wind Turbine Blades using a Marker-Tracking Method.
47. Measurement of cross-sectional velocity distribution of pneumatically conveyed particles in a square-shaped pipe through electrostatic sensing and Gaussian process regression.
48. Scaling Up Vision-Language Pretraining for Image Captioning.
49. Injecting Semantic Concepts into End-to-End Image Captioning.
50. Crossmodal Representation Learning for Zero-shot Action Recognition.
Discovery Service for Jio Institute Digital Library