1. A Simple and Effective Method for Injecting Word-level Information into Character-aware Neural Language Models
- Authors
Hidetaka Kamigaito, Manabu Okumura, Yukun Feng, and Hiroya Takamura
- Subjects
Computer science, Machine learning, Character (computing), Concatenation, Softmax function, Language model, Artificial intelligence, Word (computer architecture)
- Abstract
We propose a simple and effective method to inject word-level information into character-aware neural language models. Unlike previous approaches, which usually inject word-level information at the input of a long short-term memory (LSTM) network, we inject it into the softmax function. The resulting model can be seen as a combination of a character-aware language model and a simple word-level language model. Our injection method can also be used together with previous methods. Through experiments on 14 typologically diverse languages, we empirically show that our injection method, when combined with previous methods, outperforms those methods alone, including a gating mechanism, averaging, and concatenation of word vectors. We also provide a comprehensive comparison of these injection methods.
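The core idea of injecting word-level information into the softmax can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the paper's exact formulation: it simply adds word-level logits to character-aware logits before normalizing, so the output distribution mixes both models. All function names and scores here are hypothetical.

```python
import math

def softmax(logits):
    # numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combined_next_word_probs(char_logits, word_logits):
    # Sketch of softmax-level injection: sum the character-aware
    # logits and the word-level logits, then normalize once.
    # (Hypothetical; the paper's exact combination may differ.)
    assert len(char_logits) == len(word_logits)
    combined = [c + w for c, w in zip(char_logits, word_logits)]
    return softmax(combined)

# Hypothetical scores over a 4-word vocabulary:
char_scores = [1.0, 0.5, -0.2, 0.0]   # from a character-aware LSTM head
word_scores = [0.3, 0.2, 0.0, -0.5]   # from a simple word-level model
probs = combined_next_word_probs(char_scores, word_scores)
```

Under this sketch, a word favored by either model receives a boosted probability, which is one way the character-aware and word-level components can complement each other.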
- Published
- 2023