Search

Your search for "Piergiovanni, AJ" returned 29 results

Search Constraints

Author: "Piergiovanni, AJ"; Database: OpenAIRE

Search Results

1. Joint Adaptive Representations for Image-Language Learning

2. PaLI-X: On Scaling up a Multilingual Vision and Language Model

3. MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

4. Diversifying Joint Vision-Language Tokenization Learning

5. Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

6. Compound Tokens: Channel Fusion for Vision-Language Representation Learning

7. PaLI: A Jointly-Scaled Multilingual Language-Image Model

8. Pre-training image-language transformers for open-vocabulary tasks

9. Video Question Answering with Iterative Video-Text Co-Tokenization

10. Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

11. FindIt: Generalized Localization with Natural Language Queries

12. F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

13. 4D-Net for Learned Multi-Modal Alignment

14. TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

15. Recognizing Actions in Videos from Unseen Viewpoints

16. Unsupervised Action Segmentation for Instructional Videos

17. AViD Dataset: Anonymized Videos from Diverse Countries

18. AssembleNet++: Assembling Modality Representations via Attention Connections

19. Adversarial Generative Grammars for Human Activity Prediction

20. Evolving Losses for Unlabeled Video Representation Learning

21. Early Detection of Injuries in MLB Pitchers from Video

22. AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures

23. Model-based Behavioral Cloning with Future Image Similarity Learning

24. Representation Flow for Action Recognition

25. Fine-grained Activity Recognition in Baseball Videos

26. Temporal Gaussian Mixture Layer for Videos

27. Learning Multimodal Representations for Unseen Activities

28. Learning Latent Super-Events to Detect Multiple Activities in Videos

29. Learning Latent Sub-events in Activity Videos Using Temporal Attention Filters
