7 results for "Assif, Liav"
Search Results
2. Atoms of recognition in human and computer vision
- Author
- Ullman, Shimon, Assif, Liav, Fetaya, Ethan, and Harari, Daniel
- Published
- 2016
3. Human-like scene interpretation by a guided counterstream processing.
- Author
- Ullman, Shimon, Assif, Liav, Strugatski, Alona, Vatashsky, Ben-Zion, Levi, Hila, Netanyahu, Aviv, and Yaari, Adam
- Subjects
- VISUAL perception, NEURAL circuitry, MACHINE learning, GOAL (Psychology), ARTIFICIAL intelligence
- Abstract
In modeling vision, there has been remarkable progress in recognizing a range of scene components, but the problem of analyzing full scenes, an ultimate goal of visual perception, is still largely open. To deal with complete scenes, recent work has focused on training models to extract the full graph-like structure of a scene. In contrast with scene graphs, human scene perception focuses on selected structures in the scene, starting with a limited interpretation and evolving sequentially in a goal-directed manner [G. L. Malcolm, I. I. A. Groen, C. I. Baker, Trends Cogn. Sci. 20, 843-856 (2016)]. Guidance is crucial throughout scene interpretation, since extracting a full scene representation is often infeasible. Here, we present a model that performs human-like guided scene interpretation, using iterative bottom-up, top-down processing in a "counterstream" structure motivated by cortical circuitry. The process proceeds by the sequential application of top-down instructions that guide the interpretation process. The results show how scene structures of interest to the viewer are extracted by an automatically selected sequence of top-down instructions. The model shows two further benefits. One is an inherent capability to deal well with the problem of combinatorial generalization, that is, generalizing broadly to unseen scene configurations, which is limited in current network models [B. Lake, M. Baroni, 35th International Conference on Machine Learning, ICML 2018 (2018)]. The second is the ability to combine visual with nonvisual information at each cycle of the interpretation process, a key aspect for modeling human perception as well as for advancing AI vision systems.
- Published
- 2023
4. When standard RANSAC is not enough: cross-media visual matching with hypothesis relevancy
- Author
- Hassner, Tal, Assif, Liav, and Wolf, Lior
- Published
- 2014
5. Image interpretation by iterative bottom-up top-down processing
- Author
- Ullman, Shimon, Assif, Liav, Strugatski, Alona, Vatashsky, Ben-Zion, Levy, Hila, Netanyahu, Aviv, and Yaari, Adam
- Subjects
- combinatorial generalization, top-down processing, scene understanding, scene perception, guided vision, Computer Vision and Pattern Recognition (cs.CV), Neurons and Cognition (q-bio.NC)
- Abstract
Scene understanding requires the extraction and representation of scene components together with their properties and inter-relations. We describe a model in which meaningful scene structures are extracted from the image by an iterative process, combining bottom-up (BU) and top-down (TD) networks, interacting through symmetric bidirectional communication (counter-streams structure). The model constructs a scene representation by the iterative use of three components. The first model component is a BU stream that extracts selected scene elements, properties, and relations. The second component (cognitive augmentation) augments the extracted visual representation based on relevant non-visual stored representations. It also provides input to the third component, the TD stream, in the form of a TD instruction, instructing the model what task to perform next. The TD stream then guides the BU visual stream to perform the selected task in the next cycle. During this process, the visual representations extracted from the image can be combined with relevant non-visual representations, so that the final scene representation is based on both visual information extracted from the scene and relevant stored knowledge of the world. We describe how a sequence of TD instructions is used to extract structures of interest from the scene, including an algorithm to automatically select the next TD instruction in the sequence. The extraction process is shown to have favorable properties in terms of combinatorial generalization, generalizing well to novel scene structures and to new combinations of objects, properties, and relations not seen during training. Finally, we compare the model with relevant aspects of human vision, and suggest directions for using the BU-TD scheme for integrating visual and cognitive components in the process of scene understanding.
- Published
- 2020
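The three-component cycle described in the abstract above (BU extraction, cognitive augmentation, TD instruction selection) can be sketched as a simple loop. This is a minimal illustration only; all function and variable names are assumptions for exposition, not the authors' actual code.

```python
# Hypothetical sketch of the iterative bottom-up/top-down ("counterstream")
# interpretation loop described in the abstract.

def interpret_scene(image, bu_stream, augment, select_td_instruction, max_cycles=10):
    """Accumulate a scene representation by repeated BU extraction under TD guidance."""
    representation = {}   # scene structures accumulated across cycles
    instruction = None    # first cycle runs without top-down guidance
    for _ in range(max_cycles):
        # 1. BU stream extracts elements/properties/relations for the current task.
        representation.update(bu_stream(image, instruction))
        # 2. Cognitive augmentation adds relevant non-visual (stored) knowledge.
        representation = augment(representation)
        # 3. Select the next TD instruction; stop when none remains.
        instruction = select_td_instruction(representation)
        if instruction is None:
            break
    return representation
```

The loop terminates either when the instruction selector returns nothing more to do or after a fixed cycle budget, mirroring the sequential, goal-directed extraction the abstract describes.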
6. Structured learning and detailed interpretation of minimal object images
- Author
- Ben-Yosef, Guy, Assif, Liav, and Ullman, Shimon
- Subjects
- Computer Vision and Pattern Recognition (cs.CV)
- Abstract
We model the process of full human interpretation of object images, namely the ability to identify and localize all semantic features and parts that are recognized by human observers. The task is approached by dividing the interpretation of the complete object into the interpretation of multiple reduced but interpretable local regions. We model interpretation with a structured learning framework in which primitive components and relations play a useful role in local interpretation by humans. To identify useful components and relations used in the interpretation process, we consider the interpretation of minimal configurations, namely reduced local regions that are minimal in the sense that further reduction renders them unrecognizable and uninterpretable. We show experimental results of our model, and results of predicting and testing relations that were useful to the model via transformed minimal images.
- Note
- Accepted to the Workshop on Mutual Benefits of Cognitive and Computer Vision at the International Conference on Computer Vision, Venice, Italy, 2017
- Published
- 2017
7. Visual categorization of social interactions.
- Author
- de la Rosa, Stephan, Choudhery, Rabia N., Curio, Cristóbal, Ullman, Shimon, Assif, Liav, and Bülthoff, Heinrich H.
- Subjects
- COMPUTER science, APPLIED mathematics, USER-centered system design, MEMORY, SOCIAL integration
- Abstract
Prominent theories of action recognition suggest that during the recognition of actions the physical pattern of an action is associated with only one action interpretation (e.g., a person waving his arm is recognized as waving). In contrast to this view, studies examining the visual categorization of objects show that objects are recognized in multiple ways (e.g., a VW Beetle can be recognized as a car or as a beetle) and that categorization performance is based on the visual and motor-movement similarity between objects. Here, we studied whether there is evidence for multiple levels of categorization for social interactions (physical interactions with another person, e.g., handshakes). To do so, we compared the visual categorization of objects and social interactions (Experiments 1 and 2) in a grouping task and assessed the usefulness of motor and visual cues (Experiments 3, 4, and 5) for object and social-interaction categorization. Additionally, we measured recognition performance associated with recognizing objects and social interactions at different categorization levels (Experiment 6). We found that basic-level object categories were associated with a clear recognition advantage over subordinate recognition, whereas basic-level social-interaction categories provided only a small recognition advantage. Moreover, basic-level object categories were more strongly associated with similar visual and motor cues than basic-level social-interaction categories. The results suggest that the cognitive categories underlying the recognition of objects and social interactions are associated with different recognition performance. These results are in line with the idea that the same action can be associated with several action interpretations (e.g., a person waving his arm can be recognized as waving or as greeting).
- Published
- 2014
Discovery Service for Jio Institute Digital Library