Search for author "Kuehne, Hilde": 175 results in total

Search Results

1. State-Space Large Audio Language Models

2. Teaching VLMs to Localize Specific Objects from In-context Examples

3. Convolutional Differentiable Logic Gate Networks

4. Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms

5. MaskInversion: Localized Embeddings via Optimization of Explainability Maps

6. DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners

7. Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

8. ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

9. LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

10. Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

11. Uncertainty Quantification via Stable Distribution Propagation

12. Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs

13. Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

14. Learning Human Action Recognition Representations Without Real Humans

15. HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

16. In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval

17. Preserving Modality Structure Improves Multi-Modal Learning

18. What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation

19. Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

20. ISAAC Newton: Input-based Approximate Curvature for Newton's Method

21. Learning Situation Hyper-Graphs for Video Question Answering

22. WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition

23. What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

24. Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data

25. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

26. TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

27. Learning by Sorting: Self-supervised Learning with Group Ordering Constraints

28. Video Test-Time Adaptation for Action Recognition

29. Deep Differentiable Logic Gate Networks

30. C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

31. Contrastive Audio-Visual Masked Autoencoder

32. VL-Taboo: An Analysis of Attribute-based Zero-shot Capabilities of Vision-Language Models

33. Augmentation Learning for Semi-Supervised Classification

34. Weakly Supervised Grounding for VQA in Vision-Language Transformers

35. Differentiable Top-k Classification Learning

36. CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

37. Monotonic Differentiable Sorting Networks

38. Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

39. Unsupervised Domain Generalization by Learning a Bridge Across Domains

40. Routing with Self-Attention for Multimodal Capsule Networks

41. Cascaded Multilingual Audio-Visual Learning from Videos

42. Style Agnostic 3D Reconstruction via Adversarial Style Transfer

43. Learning with Algorithmic Supervision via Continuous Relaxations

44. Generalized and Incremental Few-Shot Learning by Explicit Learning and Calibration without Forgetting

45. Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

46. Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision

47. Unsupervised Discriminative Embedding for Sub-Action Learning in Complex Activities

48. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

49. Detector-Free Weakly Supervised Grounding by Separation

50. AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
