
Comparative study of visual saliency maps in the problem of classification of architectural images with Deep CNNs

Authors :
Alejandro Álvaro Ramírez Acosta
Kamel Guissous
Abraham Montoya Obeso
Jenny Benois-Pineau
Mireya Saraí García Vázquez
Valerie Gouet-Brunet
Laboratoire Bordelais de Recherche en Informatique (LaBRI)
Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
Méthodes d'Analyses pour le Traitement d'Images et la Stéréorestitution (MATIS)
Laboratoire des Sciences et Technologies de l'Information Géographique (LaSTIG)
École nationale des sciences géographiques (ENSG)
Institut National de l'Information Géographique et Forestière (IGN)-École nationale des sciences géographiques (ENSG)
Institut National de l'Information Géographique et Forestière (IGN)
Source :
IEEE Eighth International Conference on Image Processing Theory, Tools and Applications, IPTA 2018, Nov 2018, Xi'an, China. pp.1-6, ⟨10.1109/IPTA.2018.8608125⟩
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

Incorporating Human Visual System (HVS) models into the building of classifiers has become an intensively researched field in visual content mining. Among the variety of HVS models, we are interested in so-called visual saliency maps. In contrast to scan-paths, they model instantaneous attention by assigning to each pixel in the image plane a degree of interestingness (saliency) for humans. In various visual content understanding tasks, these maps have proved efficient at stressing the contribution of areas of interest in the image plane to classifier models. In previous works, saliency layers have been introduced in Deep CNNs, showing that they reduce training time while reaching similar accuracy and loss values in optimal models. For large image collections, efficient building of saliency maps is based on predictive models of visual attention. These models are generally bottom-up and are not adapted to specific visual tasks, unless they are built for specific content, such as the "urban images"-targeted saliency maps that we also compare in this paper. In the present research we propose a "bootstrap" strategy for building visual saliency maps for particular tasks of visual data mining. A small collection of images relevant to the visual understanding problem is annotated with gaze fixations. The resulting saliency is then propagated to a large training dataset and compared with the classical GBVS model and a recent saliency method for urban image content. The classification results within the Deep CNN framework are promising compared to purely automatic visual saliency prediction.
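
As a rough illustration of what a per-pixel saliency map built from gaze fixations can look like, the sketch below sums an isotropic Gaussian at each fixation point and scales the result to [0, 1]. This is a common generic construction and not necessarily the procedure used in the paper; the abstract does not give those details. The function names, the `sigma` value, and the multiplicative weighting of the image are all assumptions made for illustration.

```python
import numpy as np


def fixation_saliency_map(fixations, height, width, sigma=25.0):
    """Build a saliency map from gaze fixations (hypothetical helper).

    fixations: iterable of (x, y) pixel coordinates.
    sigma: Gaussian spread in pixels (an assumed value, not from the paper).
    Returns a (height, width) float array scaled to [0, 1].
    """
    ys, xs = np.mgrid[0:height, 0:width]
    saliency = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        # Each fixation contributes a Gaussian bump centred on (fx, fy).
        saliency += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2.0 * sigma ** 2))
    peak = saliency.max()
    # Leave an all-zero map untouched when there are no fixations.
    return saliency / peak if peak > 0 else saliency


def weight_image(image, saliency):
    """Scale every channel of an (H, W, C) image by the per-pixel saliency."""
    return image * saliency[..., np.newaxis]


if __name__ == "__main__":
    smap = fixation_saliency_map([(10, 20), (25, 5)], height=40, width=30)
    image = np.ones((40, 30, 3))
    weighted = weight_image(image, smap)
    print(smap.shape, weighted.shape)
```

Multiplying the input by the map is only one simple way to stress areas of interest; how saliency enters the network in the paper (for example, through dedicated saliency layers) is not specified in this record.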

Details

Language :
English
Database :
OpenAIRE
Journal :
IEEE Eighth International Conference on Image Processing Theory, Tools and Applications, IPTA 2018, Nov 2018, Xi'an, China. pp.1-6, ⟨10.1109/IPTA.2018.8608125⟩
Accession number :
edsair.doi.dedup.....6cf5cbb52123083faf29dfcb86b0a62b
Full Text :
https://doi.org/10.1109/IPTA.2018.8608125