Multimodal video classification with stacked contractive autoencoders.

Authors :
Liu, Yanan
Feng, Xiaoqing
Zhou, Zhiguang
Source :
Signal Processing. Mar 2016, Vol. 120, p761-766. 6p.
Publication Year :
2016

Abstract

In this paper we propose a multimodal feature learning mechanism based on deep networks (i.e., stacked contractive autoencoders) for video classification. Considering the three modalities in video, i.e., image, audio and text, we first build one Stacked Contractive Autoencoder (SCAE) for each single modality, whose outputs are joined together and fed into another Multimodal Stacked Contractive Autoencoder (MSCAE). The first stage preserves intra-modality semantic relations and the second stage discovers inter-modality semantic correlations. Experiments on a real-world dataset demonstrate that the proposed approach achieves better performance than state-of-the-art methods. [ABSTRACT FROM AUTHOR]
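
The two-stage data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions, layer sizes, sigmoid activations, tied decoder weights, and the penalty weight `lam` are all assumptions chosen for illustration, and the helper names (`ContractiveLayer`, `build_stack`, `encode_stack`) are hypothetical. Training is omitted; the sketch only shows the forward pass and the contractive loss of a single layer, i.e. reconstruction error plus the squared Frobenius norm of the encoder Jacobian.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ContractiveLayer:
    """One autoencoder layer with a contractive penalty (assumed: sigmoid units, tied weights)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_hidden, n_in))
        self.b = np.zeros(n_hidden)  # encoder bias
        self.c = np.zeros(n_in)      # decoder bias

    def encode(self, x):
        return sigmoid(x @ self.W.T + self.b)

    def decode(self, h):
        return sigmoid(h @ self.W + self.c)

    def loss(self, x, lam=0.1):
        h = self.encode(x)
        recon = self.decode(h)
        mse = np.mean(np.sum((x - recon) ** 2, axis=1))
        # For a sigmoid encoder, ||dh/dx||_F^2 = sum_j (h_j (1 - h_j))^2 * sum_i W_ji^2
        penalty = np.mean(((h * (1.0 - h)) ** 2) @ np.sum(self.W ** 2, axis=1))
        return mse + lam * penalty


def build_stack(sizes, rng):
    """Build a stack of layers from a list of layer widths."""
    return [ContractiveLayer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def encode_stack(layers, x):
    """Pass a batch through every encoder in the stack."""
    for layer in layers:
        x = layer.encode(x)
    return x


rng = np.random.default_rng(0)
batch = 4

# Placeholder features for the three modalities (dimensions are assumptions).
image = rng.random((batch, 64))
audio = rng.random((batch, 32))
text = rng.random((batch, 16))

# Stage 1: one stack per modality (intra-modality representation).
image_scae = build_stack([64, 32, 16], rng)
audio_scae = build_stack([32, 16, 8], rng)
text_scae = build_stack([16, 12, 8], rng)

# Join the single-modality codes and feed them to the multimodal stack.
joint_in = np.concatenate(
    [
        encode_stack(image_scae, image),
        encode_stack(audio_scae, audio),
        encode_stack(text_scae, text),
    ],
    axis=1,
)

# Stage 2: multimodal stack (inter-modality representation).
mscae = build_stack([joint_in.shape[1], 24, 12], rng)
joint_code = encode_stack(mscae, joint_in)

# Contractive loss of the first image layer on this batch.
first_layer_loss = image_scae[0].loss(image)
```

In this sketch `joint_code` stands in for the learned multimodal feature that a downstream classifier would consume; the abstract does not specify the classifier, so none is shown.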

Details

Language :
English
ISSN :
01651684
Volume :
120
Database :
Academic Search Index
Journal :
Signal Processing
Publication Type :
Academic Journal
Accession number :
111496596
Full Text :
https://doi.org/10.1016/j.sigpro.2015.01.001