8 results on '"Ali Borji"'
Search Results
2. An exocentric look at egocentric actions and vice versa
- Author
- Shervin Ardeshir and Ali Borji
- Subjects
Endocentric and exocentric, Computer science, Computing Methodologies: Image Processing and Computer Vision, 02 engineering and technology, Class (biology), Domain (software engineering), Task (project management), Third person, Action (philosophy), Human–computer interaction, 020204 information systems, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Versa, Software, Video retrieval
- Abstract
In this work we address the task of relating action information across two drastically different visual domains, namely, first-person (egocentric) and third-person (exocentric). We investigate two different yet highly interconnected problems including cross-view action classification and action based video retrieval. First, we perform action classification in one domain using the knowledge transferred from the other domain. Second, given a video in one view, we retrieve videos from the same action class in the other view. In order to evaluate our models, we collect a new cross-domain dataset of egocentric-exocentric action videos containing 14 action classes and 3569 videos (1676 collected egocentric videos and 1893 exocentric videos borrowed from the UCF 101 dataset). Our results demonstrate the possibility of transferring action information across the two domains and suggest new directions in relating first and third person vision for other tasks.
- Published
- 2018
- Full Text
- View/download PDF
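The cross-view retrieval task described in the abstract above can be illustrated with a short sketch: embed videos from both views into a shared space and rank exocentric clips by cosine similarity to an egocentric query. This is only a minimal illustration under assumed, precomputed embeddings; it is not the authors' actual model or architecture.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def retrieve_cross_view(query_ego_feat, exo_feats, top_k=5):
    """Rank exocentric clips by similarity to an egocentric query feature.

    query_ego_feat: (d,) embedding of the egocentric query video.
    exo_feats: (n, d) embeddings of exocentric gallery videos.
    Both are assumed to live in a shared embedding space learned elsewhere.
    """
    scores = [cosine_similarity(query_ego_feat, f) for f in exo_feats]
    order = np.argsort(scores)[::-1]            # highest similarity first
    return order[:top_k], [scores[i] for i in order[:top_k]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=128)                # placeholder egocentric embedding
    gallery = rng.normal(size=(1893, 128))      # placeholder exocentric embeddings
    idx, sims = retrieve_cross_view(query, gallery)
    print("top matches:", idx)
```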
3. Negative results in computer vision: A perspective
- Author
- Ali Borji
- Subjects
FOS: Computer and information sciences, Value (ethics), business.industry, Computer Vision and Pattern Recognition (cs.CV), Perspective (graphical), Computer Science - Computer Vision and Pattern Recognition, 02 engineering and technology, Outcome (game theory), 03 medical and health sciences, 0302 clinical medicine, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Psychology, business, 030217 neurology & neurosurgery, Cognitive vision, Statistical hypothesis testing
- Abstract
A negative result is when the outcome of an experiment or a model is not what is expected or when a hypothesis does not hold. Despite being often overlooked in the scientific community, negative results are results and they carry value. While this topic has been extensively discussed in other fields such as social sciences and biosciences, less attention has been paid to it in the computer vision community. The unique characteristics of computer vision, particularly its experimental aspect, call for a special treatment of this matter. In this manuscript, I will address what makes negative results important, how they should be disseminated and incentivized, and what lessons can be learned from cognitive vision research in this regard. Further, I will discuss matters such as experimental design, statistical hypothesis testing, explanatory versus predictive modeling, performance evaluation, model comparison, reproducibility of findings, the confluence of computer vision and human vision, as well as computer vision research culture.
- Published
- 2018
- Full Text
- View/download PDF
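One of the methodological points listed in the abstract above, statistical hypothesis testing for model comparison, can be made concrete with a small sketch: a paired permutation test on per-image scores of two models. This is a generic illustration of the idea, not a procedure prescribed by the paper, and the scores below are synthetic.

```python
import numpy as np

def paired_permutation_test(scores_a, scores_b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on the mean per-image score difference."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_a) - np.asarray(scores_b)
    observed = diffs.mean()
    count = 0
    for _ in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=diffs.shape)  # randomly flip each pairing
        if abs((signs * diffs).mean()) >= abs(observed):
            count += 1
    return observed, count / n_perm

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    model_a = rng.normal(0.62, 0.05, size=100)   # synthetic per-image scores
    model_b = rng.normal(0.60, 0.05, size=100)
    diff, p = paired_permutation_test(model_a, model_b)
    print(f"mean difference={diff:.4f}, p={p:.4f}")
```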
4. Augmented saliency model using automatic 3D head pose detection and learned gaze following in natural scenes
- Author
- Ali Borji, Daniel Parks, and Laurent Itti
- Subjects
Male, Eye Movements, Head (linguistics), Computer science, Posture, Computing Methodologies: Image Processing and Computer Vision, Fixation, Ocular, Oracle, Imaging, Three-Dimensional, Humans, Natural (music), Computer vision, Probability, Structure (mathematical logic), Communication, Markov chain, business.industry, Eye movement, Gaze, Sensory Systems, Weighting, Ophthalmology, Pattern Recognition, Visual, Female, Artificial intelligence, business, Head
- Abstract
Previous studies have shown that gaze direction of actors in a scene influences eye movements of passive observers during free-viewing (Castelhano, Wieth, & Henderson, 2007; Borji, Parks, & Itti, 2014). However, no computational model has been proposed to combine bottom-up saliency with actor’s head pose and gaze direction for predicting where observers look. Here, we first learn probability maps that predict fixations leaving head regions (gaze following fixations), as well as fixations on head regions (head fixations), both dependent on the actor’s head size and pose angle. We then learn a combination of gaze following, head region, and bottom-up saliency maps with a Markov chain composed of head region and non-head region states. This simple structure allows us to inspect the model and make comments about the nature of eye movements originating from heads as opposed to other regions. Here, we assume perfect knowledge of actor head pose direction (from an oracle). The combined model, which we call the Dynamic Weighting of Cues model (DWOC), explains observers’ fixations significantly better than each of the constituent components. Finally, in a fully automatic combined model, we replace the oracle head pose direction data with detections from a computer vision model of head pose. Using these (imperfect) automated detections, we again find that the combined model significantly outperforms its individual components. Our work extends the engineering and scientific applications of saliency models and helps better understand mechanisms of visual attention.
- Published
- 2015
- Full Text
- View/download PDF
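The combination step described in the abstract above, weighting gaze-following, head, and bottom-up saliency maps with weights that depend on whether the previous fixation landed on a head, can be sketched roughly as below. The two-state switch and the numeric weights are illustrative assumptions, not the fitted DWOC parameters.

```python
import numpy as np

# Illustrative cue weights for the two states (head / non-head).
# The actual DWOC weights are learned from data; these are made up.
WEIGHTS = {
    "head":     {"gaze_following": 0.5, "head_map": 0.2, "bottom_up": 0.3},
    "non_head": {"gaze_following": 0.2, "head_map": 0.4, "bottom_up": 0.4},
}

def combine_cues(gaze_following, head_map, bottom_up, prev_fix_on_head):
    """Combine three cue maps with state-dependent weights."""
    state = "head" if prev_fix_on_head else "non_head"
    w = WEIGHTS[state]
    combined = (w["gaze_following"] * gaze_following
                + w["head_map"] * head_map
                + w["bottom_up"] * bottom_up)
    return combined / (combined.max() + 1e-8)   # normalize to [0, 1]

if __name__ == "__main__":
    h, w = 48, 64
    rng = np.random.default_rng(0)
    gf, hm, bu = (rng.random((h, w)) for _ in range(3))  # placeholder cue maps
    sal = combine_cues(gf, hm, bu, prev_fix_on_head=True)
    print(sal.shape, sal.max())
```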
5. What do eyes reveal about the mind?
- Author
- Ali Borji, Andreas Lennartz, and Marc Pomplun
- Subjects
Visual search, Computer science, business.industry, Cognitive Neuroscience, Eye movement, Wearable computer, Machine learning, computer.software_genre, Gaze, Computer Science Applications, Gaze-contingency paradigm, Artificial Intelligence, Fixation (visual), Visual attention, Artificial intelligence, business, computer
- Abstract
We address the question of inferring the search target from fixation behavior in visual search. Such inference is possible since during search, our attention and gaze are guided toward visual features similar to those in the search target. We strive to answer two fundamental questions: what are the most powerful algorithmic principles for this task, and how does their performance depend on the amount of available eye movement data and the complexity of the target objects? In the first two experiments, we choose a random-dot search paradigm to eliminate contextual influences on search. We present an algorithm that correctly infers the target pattern up to 50 times as often as a previously employed method and promises sufficient power and robustness for interface control. Moreover, the current data suggest a principal limitation of target inference that is crucial for interface design: if the target pattern exceeds a certain spatial complexity level, only a subpattern tends to guide the observers' eye movements, which drastically impairs target inference. In the third experiment, we show that it is possible to predict search targets in natural scenes using pattern classifiers and classic computer vision features significantly above chance. The availability of compelling inferential algorithms could initiate a new generation of smart, gaze-controlled interfaces and wearable visual technologies that deduce from their users' eye movements the visual information for which they are looking. In a broader perspective, our study shows directions for efficient intent decoding from eye movements.
Highlights:
- Providing a unified theoretical framework for intent decoding using eye movements.
- Proposing two new algorithms for search target inference from fixations.
- Studying the impact of target complexity in search performance and target inference.
- Sharing a large collection of code and data to promote future research in this area.
- Published
- 2015
- Full Text
- View/download PDF
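A minimal version of the target-inference idea in the abstract above, scoring candidate targets by how similar they are to what the observer fixated, could look like the sketch below. Feature extraction is reduced here to raw pixel patches around fixations and a cosine match; the paper's actual algorithms are more involved, so treat this only as an illustration of the principle.

```python
import numpy as np

def patch_at(image, fixation, size=16):
    """Crop a square patch centered on a fixation (y, x), zero-padded at borders."""
    y, x = fixation
    h, w = image.shape[:2]
    y0, x0 = max(0, y - size // 2), max(0, x - size // 2)
    y1, x1 = min(h, y0 + size), min(w, x0 + size)
    patch = np.zeros((size, size) + image.shape[2:], dtype=image.dtype)
    patch[: y1 - y0, : x1 - x0] = image[y0:y1, x0:x1]
    return patch.ravel().astype(float)

def infer_target(image, fixations, candidate_targets):
    """Pick the candidate target most similar (on average) to the fixated patches."""
    fix_feats = [patch_at(image, f) for f in fixations]
    scores = []
    for target in candidate_targets:
        t = target.ravel().astype(float)
        sims = [np.dot(f, t) / (np.linalg.norm(f) * np.linalg.norm(t) + 1e-8)
                for f in fix_feats]
        scores.append(np.mean(sims))
    return int(np.argmax(scores)), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 2, size=(128, 128))        # random-dot style stimulus
    fixes = [(40, 40), (60, 80), (90, 30)]           # toy fixation locations
    candidates = [rng.integers(0, 2, size=(16, 16)) for _ in range(5)]
    best, _ = infer_target(img, fixes, candidates)
    print("inferred target index:", best)
```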
6. What stands out in a scene? A study of human explicit saliency judgment
- Author
- Dicky N. Sihite, Laurent Itti, and Ali Borji
- Subjects
Adult, Male, Adolescent, Computer science, Computing Methodologies: Image Processing and Computer Vision, Fixation, Ocular, Models, Biological, Young Adult, Psychophysics, Humans, Attention, Computer vision, Bottom-up saliency, Object-based attention, Analysis of Variance, Computational model, Communication, business.industry, Explicit saliency judgment, Eye movement, Fixation (psychology), Gaze, Sensory Systems, Eye movements, Ophthalmology, Space-based attention, Outlier, Visual Perception, Eye tracking, Female, Artificial intelligence, business, Free viewing, Photic Stimulation, De facto standard
- Abstract
Eye tracking has become the de facto standard measure of visual attention in tasks that range from free viewing to complex daily activities. In particular, saliency models are often evaluated by their ability to predict human gaze patterns. However, fixations are not only influenced by bottom-up saliency (computed by the models), but also by many top-down factors. Thus, comparing bottom-up saliency maps to eye fixations is challenging and has required that one try to minimize top-down influences, for example by focusing on early fixations on a stimulus. Here we propose two complementary procedures to evaluate visual saliency. We investigate whether humans have explicit and conscious access to the saliency computations believed to contribute to guiding attention and eye movements. In the first experiment, 70 observers were asked to choose which object stands out the most based on its low-level features in 100 images each containing only two objects. Using several state-of-the-art bottom-up visual saliency models that measure local and global spatial image outliers, we show that maximum saliency inside the selected object is significantly higher than inside the non-selected object and the background. Thus, spatial outliers are a predictor of human judgments. Performance of this predictor is boosted by including object size as an additional feature. In the second experiment, observers were asked to draw a polygon circumscribing the most salient object in cluttered scenes. For each of 120 images, we show that a map built from annotations of 70 observers explains eye fixations of another 20 observers freely viewing the images, significantly above chance (dataset by Bruce and Tsotsos (2009); shuffled AUC score 0.62 ± 0.07, chance 0.50, t-test p < 0.05). We conclude that fixations agree with saliency judgments, and classic bottom-up saliency models explain both. We further find that computational models specifically designed for fixation prediction slightly outperform models designed for salient object detection over both types of data (i.e., fixations and objects).
- Published
- 2013
- Full Text
- View/download PDF
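The shuffled AUC score reported in the abstract above is a standard saliency metric: saliency values at true fixations serve as positives, while values at fixations borrowed from other images serve as negatives, which penalizes center bias. The sketch below is a rough, rank-based version of that computation and is not the exact evaluation code or data handling used in the paper.

```python
import numpy as np

def shuffled_auc(saliency_map, fixations, other_image_fixations):
    """Shuffled AUC: saliency at true fixations (positives) is ranked against
    saliency at fixations taken from other images (negatives)."""
    pos = np.array([saliency_map[y, x] for y, x in fixations], dtype=float)
    neg = np.array([saliency_map[y, x] for y, x in other_image_fixations], dtype=float)
    # Rank-based AUC: probability that a positive outscores a negative (ties count 0.5).
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sal = rng.random((240, 320))                                  # placeholder saliency map
    fix = [(rng.integers(240), rng.integers(320)) for _ in range(20)]     # true fixations
    other = [(rng.integers(240), rng.integers(320)) for _ in range(200)]  # "shuffled" negatives
    print("sAUC ~", round(shuffled_auc(sal, fix, other), 3))
```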
7. Online learning of task-driven object-based visual attention control
- Author
- Majid Nili Ahmadabadi, Babak Nadjar Araabi, Ali Borji, and Mandana Hamidi
- Subjects
business.industry, Computer science, Cognitive neuroscience of visual object recognition, Object (computer science), Action selection, Visual processing, Method, Signal Processing, Biased Competition Theory, Object model, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, business, Object-based attention
- Abstract
We propose a biologically-motivated computational model for learning task-driven and object-based visual attention control in interactive environments. In this model, top-down attention is learned interactively and is used to search for a desired object in the scene through biasing the bottom-up attention in order to form a need-based and object-driven state representation of the environment. Our model consists of three layers. First, in the early visual processing layer, the most salient location of a scene is derived using the biased saliency-based bottom-up model of visual attention. Then a cognitive component in the higher visual processing layer performs an application-specific operation, such as object recognition, at the focus of attention. From this information, a state is derived in the decision making and learning layer. Top-down attention is learned by the U-TREE algorithm, which successively grows an object-based binary tree. Internal nodes in this tree check the existence of a specific object in the scene by biasing the early vision and the object recognition parts. Its leaves point to states in the action value table. Motor actions are associated with the leaves. After performing a motor action, the agent receives a reinforcement signal from the critic. This signal is alternately used for modifying the tree or updating the action selection policy. The proposed model is evaluated on visual navigation tasks, where the obtained results lend support to the applicability and usefulness of the developed method for robotics.
- Published
- 2010
- Full Text
- View/download PDF
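The learning loop in the decision layer described above ends with a reinforcement signal from the critic that updates the action-value table attached to the tree leaves. The sketch below shows only that tabular update in a standard Q-learning style, with made-up parameters and hypothetical leaf names; the U-TREE machinery that actually grows and splits the tree is omitted.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # illustrative learning parameters
ACTIONS = ["turn_left", "turn_right", "move_forward"]

# Action-value table indexed by leaf state (here just a string id).
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def select_action(leaf_state):
    """Epsilon-greedy action selection from the leaf's action values."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    values = Q[leaf_state]
    return max(values, key=values.get)

def update(leaf_state, action, reward, next_leaf_state):
    """Update the action-value entry after receiving the critic's reinforcement signal."""
    best_next = max(Q[next_leaf_state].values())
    Q[leaf_state][action] += ALPHA * (reward + GAMMA * best_next
                                      - Q[leaf_state][action])

if __name__ == "__main__":
    random.seed(0)
    s, s_next = "leaf:door_visible", "leaf:door_centered"   # hypothetical leaf states
    a = select_action(s)
    update(s, a, reward=1.0, next_leaf_state=s_next)
    print(a, Q[s][a])
```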
8. Special issue on recent advances in saliency models, applications and evaluations
- Author
- Ali Borji, Zhi Liu, Hongliang Li, and Olivier Le Meur
- Subjects
Computer science, business.industry, Signal Processing, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, Machine learning, computer.software_genre, business, computer, Software
- Published
- 2015
- Full Text
- View/download PDF