A Survey of Multimodal Deep Learning.

Authors :: 刘建伟
 丁熙浩
 罗雄麟
Source :: Application Research of Computers / Jisuanji Yingyong Yanjiu. Jun2020, Vol. 37 Issue 6, p1601-1614. 14p.
Publication Year :: 2020
Abstract: This paper surveys current work on multimodal deep learning. It identifies the problems that recur when multimodal deep learning is applied across different combinations of modalities and different learning objectives, classifies those problems, and describes the methods proposed for each during the early development of the field. Specifically, it covers multimodal deep learning over natural language, vision, and audio, in research directions including language translation, event detection, information description, emotion recognition, voice recognition and synthesis, and multimedia retrieval. From these it concludes that there are four types of common problems: multimodal representation, multimodal interpretation, multimodal fusion, and multimodal alignment. The paper subdivides and discusses each of these problems and lists the neural network models developed to solve them. Finally, it introduces some practical multimodal systems, lists the benchmark datasets and evaluation criteria used in multimodal deep learning, and outlines directions for future research. [ABSTRACT FROM AUTHOR]