Building semantic understanding beyond deep learning from sound and vision
- Source :
- ICPR
- Publication Year :
- 2016
- Publisher :
- IEEE, 2016.
Abstract
- Deep learning models have recently outperformed traditional approaches in several computer vision applications, such as image classification, object recognition, and action recognition. However, these models are not naturally designed to learn structural information, which can be important for tasks such as human pose estimation and structured semantic interpretation of video events. In this paper, we demonstrate how to build a structured semantic understanding of audio-video events by reasoning over the multi-label decisions of deep visual models and auditory models, using Grenander's structures to impose semantic consistency. The proposed structured model does not require joint training of the structural semantic dependencies and the deep models; instead, they are independent components linked by Grenander's structures. Furthermore, we exploit Grenander's structures to enrich the model through fusion of multimodal sensory data, in particular auditory features with visual features. Overall, we observe improvements in the quality of semantic interpretations when deep models and auditory features are combined with Grenander's structures, with gains of up to 11.5% in precision and 12.3% in recall.
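The idea in the abstract, keeping the classifiers fixed and imposing semantic consistency on their multi-label decisions afterwards, can be illustrated with a small sketch. This is not the paper's formulation: the slot names, labels, confidence scores, compatibility weights, and the brute-force search are all hypothetical, chosen only to show how a consistency term can override an independent per-model decision.

```python
import math
from itertools import product

# Hypothetical top-k decisions from independently trained models.
# All labels and confidence scores are invented for illustration.
candidates = {
    "visual_action": {"pour": 0.55, "stir": 0.45},
    "visual_object": {"bowl": 0.55, "cup": 0.45},
    "audio_event": {"liquid_pouring": 0.70, "clinking": 0.30},
}

# Hypothetical symmetric compatibility scores between pairs of labels.
# Pairs that are not listed contribute nothing to the energy.
compatibility = {
    frozenset(("pour", "cup")): 1.0,
    frozenset(("pour", "liquid_pouring")): 1.5,
    frozenset(("stir", "bowl")): 1.0,
    frozenset(("stir", "clinking")): 1.0,
}


def energy(labels):
    """Lower is better: confident labels and compatible pairs reduce energy."""
    unary = -sum(math.log(candidates[slot][label]) for slot, label in labels.items())
    chosen = list(labels.values())
    pairwise = -sum(
        compatibility.get(frozenset((a, b)), 0.0)
        for i, a in enumerate(chosen)
        for b in chosen[i + 1:]
    )
    return unary + pairwise


def best_interpretation():
    """Exhaustively score every label combination and keep the lowest energy."""
    slots = list(candidates)
    configs = (
        dict(zip(slots, combo))
        for combo in product(*(candidates[slot] for slot in slots))
    )
    return min(configs, key=energy)


def independent_decisions():
    """Baseline: each model's most confident label, with no consistency term."""
    return {slot: max(scores, key=scores.get) for slot, scores in candidates.items()}
```

With these made-up numbers, the visual object model on its own slightly prefers `bowl`, but the joint interpretation selects `cup` because it is compatible with `pour` and with the audio label `liquid_pouring`. Exhaustive search is only practical for a handful of slots; a larger label space would need an approximate inference method.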
- Subjects :
- Contextual image classification
Computer science
Semantic interpretation
Deep learning
Feature extraction
Cognitive neuroscience of visual object recognition
02 engineering and technology
010501 environmental sciences
Machine learning
Semantics
01 natural sciences
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
Precision and recall
Pose
Natural language processing
0105 earth and related environmental sciences
Details
- Database :
- OpenAIRE
- Journal :
- 2016 23rd International Conference on Pattern Recognition (ICPR)
- Accession number :
- edsair.doi...........9b1ce49a0c81eef8c716e9b1ae45594d
- Full Text :
- https://doi.org/10.1109/icpr.2016.7899945