Your search for author "Tapaswi, Makarand" returned 111 results.

Search Results

1. No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning

2. Major Entity Identification: A Generalizable Alternative to Coreference Resolution

3. VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

4. 'Previously on ...' From Recaps to Story Summarization

5. MICap: A Unified Model for Identity-aware Movie Descriptions

6. NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

7. Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays

8. How you feelin'? Learning Emotions and Mental States in Movie Scenes

9. GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering

10. Test of Time: Instilling Video-Language Models with a Sense of Time

11. Sonus Texere! Automated Dense Soundtrack Construction for Books using Movie Adaptations

12. Can we Adopt Self-supervised Pretraining for Chest X-Rays?

13. Language Conditioned Spatial Relation Reasoning for 3D Object Grounding

14. Unsupervised Audio-Visual Lecture Segmentation

15. Grounded Video Situation Recognition

16. Instruction-driven history-aware policies for robotic manipulations

17. Learning from Unlabeled 3D Environments for Vision-and-Language Navigation

18. Learning Object Manipulation Skills from Video via Approximate Differentiable Physics

19. Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

20. Feature Generation for Long-tail Classification

21. Airbert: In-domain Pretraining for Vision-and-Language Navigation

22. Learning Object Manipulation Skills via Approximate State Estimation from Real Videos

23. Clustering based Contrastive Learning for Improving Face Representations

24. Deep Multimodal Feature Encoding for Video Ordering

25. Learning Interactions and Relationships between Movie Characters

26. The Shmoop Corpus: A Dataset of Stories with Loosely Aligned Summaries

27. Video Face Clustering with Unknown Number of Clusters

28. HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips

29. Self-Supervised Learning of Face Representations for Video Face Clustering

30. Visual Reasoning by Progressive Module Networks

31. MovieGraphs: Towards Understanding Human-Centric Situations from Videos

32. Situation Recognition with Graph Neural Networks

33. Relaxed Earth Mover's Distances for Chain- and Tree-connected Spaces and their use as a Loss Function in Deep Learning

34. Recovering the Missing Link: Predicting Class-Attribute Associations for Unsupervised Zero-Shot Learning

37. MovieQA: Understanding Stories in Movies through Question-Answering

42. Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

48. Fusion of Speech, Faces and Text for Person Identification in TV Broadcast
