Abstract Visual Reasoning Enabled by Language
- Publication Year :
- 2023
Abstract
- While artificial intelligence (AI) models have achieved human or even superhuman performance in many well-defined applications, they still struggle to show signs of broad and flexible intelligence. The Abstraction and Reasoning Corpus (ARC), a visual intelligence benchmark introduced by François Chollet, aims to assess how close AI systems are to human-like cognitive abilities. Most current approaches rely on carefully handcrafted domain-specific program searches to brute-force solutions for the tasks present in ARC. In this work, we propose a general learning-based framework for solving ARC. It is centered on transforming tasks from the vision to the language domain. This composition of language and vision allows for pre-trained models to be leveraged at each stage, enabling a shift from handcrafted priors towards the learned priors of the models. While not yet beating state-of-the-art models on ARC, we demonstrate the potential of our approach, for instance, by solving some ARC tasks that have not been solved previously.
- Comment: The first two authors have contributed equally to this work. Accepted as regular paper at CVPR 2023 Workshop and Challenges for New Frontiers in Visual Language Reasoning: Compositionality, Prompts and Causality (NFVLR)
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2303.04091
- Document Type :
- Working Paper