Self-Supervised Audio-Visual Feature Learning for Single-Modal Incremental Terrain Type Clustering
- Source :
- IEEE Access, Vol 9, Pp 64346-64357 (2021)
- Publication Year :
- 2021
- Publisher :
- IEEE, 2021.
Abstract
- The key to an accurate understanding of terrain is extracting informative features from multi-modal data obtained from different devices. Sensors such as RGB cameras, depth sensors, vibration sensors, and microphones serve as the sources of this multi-modal data. Many studies have explored ways to use them, especially in robotics, and some papers have successfully introduced single-modal or multi-modal methods. In practice, however, robots can face extreme conditions: microphones do not work well in crowded scenes, and an RGB camera cannot capture terrain well in the dark. In this paper, we present a novel framework that applies a multi-modal variational autoencoder and the Gaussian mixture model clustering algorithm to image and audio data for terrain type clustering, forcing the features to be closer together in the feature space. Our method enables terrain type clustering even if one of the modalities (either image or audio) is missing at test time. We compared the clustering accuracy against a conventional multi-modal terrain type clustering method and conducted ablation studies to show the effectiveness of our approach.
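The clustering idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the multi-modal variational autoencoder is replaced by simulated one-dimensional features that are assumed to be already aligned across modalities, and every name here (`fit_gmm_1d`, `predict_cluster`, the feature values) is hypothetical. It only shows why, once image and audio features of the same terrain lie close together, a Gaussian mixture fitted on the shared space can assign a cluster from a single modality.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    ordered = sorted(xs)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances


def predict_cluster(x, params):
    """Return the index of the most likely mixture component for x."""
    weights, means, variances = params
    scores = [w * gaussian_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


# Assumption: two terrain types whose image and audio features have been
# pulled close together in a shared space (centres 0.0 and 5.0 are made up).
rng = random.Random(0)
train = []
for centre in (0.0, 5.0):
    image_feats = [rng.gauss(centre, 0.5) for _ in range(50)]
    audio_feats = [rng.gauss(centre, 0.5) for _ in range(50)]
    train.extend(image_feats + audio_feats)

params = fit_gmm_1d(train, k=2)

# At test time only one modality is available for each sample.
image_only_cluster = predict_cluster(0.3, params)
audio_only_cluster = predict_cluster(4.8, params)
print(image_only_cluster, audio_only_cluster)
```

In this sketch the single-modality samples fall into different clusters because the training features of both modalities were assumed to share one space; learning that alignment is the part the paper's framework addresses and is not modelled here.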
- Subjects :
- General Computer Science
Computer science
Feature vector
Feature extraction
Terrain
02 engineering and technology
010501 environmental sciences
01 natural sciences
Self-supervised
0202 electrical engineering, electronic engineering, information engineering
General Materials Science
Computer vision
Cluster analysis
0105 earth and related environmental sciences
business.industry
General Engineering
Mixture model
Autoencoder
TK1-9971
RGB color model
020201 artificial intelligence & image processing
Artificial intelligence
terrain type clustering
Electrical engineering. Electronics. Nuclear engineering
multi-modal learning
business
Feature learning
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 9
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....010e1d794a0cf0dbfef67b1c8c5952dc