Search

Search for "Baraldi, Lorenzo": 25 results

Search Constraints

Author: "Baraldi, Lorenzo"
Topic: computer science - computation and language

Search Results

1. Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training

2. Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

3. BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues

4. Towards Retrieval-Augmented Architectures for Image Captioning

5. Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs

6. AIGeN: An Adversarial Approach for Instruction Generation in VLN

7. The Revolution of Multimodal Large Language Models: A Survey

8. Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

9. With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning

10. Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation

11. Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation

12. Embodied Agents for Efficient Exploration and Smart Scene Description

13. ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval

14. Retrieval-Augmented Transformer for Image Captioning

15. CaMEL: Mean Teacher Learning for Image Captioning

16. Universal Captioner: Inducing Content-Style Separation in Vision-and-Language Model Training

17. Working Memory Connections for LSTM

18. From Show to Tell: A Survey on Deep Learning-based Image Captioning

19. Learning to Select: A Fully Attentive Approach for Novel Object Captioning

20. Explore and Explain: Self-supervised Navigation and Recounting

21. A Novel Attention-based Aggregation Function to Combine Vision and Language

22. Meshed-Memory Transformer for Image Captioning

23. Multimodal Attention Networks for Low-Level Vision-and-Language Navigation

24. SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability

25. Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
