1. Subtraction Gates: Another Way to Learn Long-Term Dependencies in Recurrent Neural Networks.
- Author
- He, Tao; Mao, Hua; Yi, Zhang
- Subjects
- *RECURRENT neural networks, *JACOBIAN matrices
- Abstract
Recurrent neural networks (RNNs) can remember temporal contextual information over various time steps. The well-known gradient vanishing/explosion problem restricts the ability of RNNs to learn long-term dependencies. The gate mechanism is a well-developed method for learning long-term dependencies in long short-term memory (LSTM) models and their variants. These models usually use multiplication terms as gates to control the input and output of RNNs during forward computation and to ensure a constant error flow during training. In this article, we propose the use of subtraction terms as another type of gate to learn long-term dependencies. Specifically, the multiplication gates are replaced by subtraction gates, and the activations of the RNN's input and output are directly controlled by subtracting the subtrahend terms. The error flows remain constant, as the linear identity connection is retained during training. The proposed subtraction gates allow more flexible choices of internal activation functions than the multiplication gates of LSTM. The experimental results using the proposed Subtraction RNN (SRNN) indicate performance comparable to LSTM and the gated recurrent unit in the Embedded Reber Grammar, Penn Treebank, and Pixel-by-Pixel MNIST experiments. To achieve these results, the SRNN requires approximately three-quarters of the parameters used by LSTM. We also show that a hybrid model combining multiplication forget gates and subtraction gates can achieve good performance. [ABSTRACT FROM AUTHOR]
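The abstract's core idea can be sketched in code. The following is a minimal, illustrative NumPy sketch of a subtraction-gated recurrent step, not the paper's exact SRNN equations: the weight names (`W_s`, `U_s`, `W_c`, `U_c`) and the particular update rule are assumptions chosen only to show the contrast with multiplicative gating, where the previous state is modulated by subtracting a learned subtrahend term instead of being multiplied by a gate, so the linear identity path from `h_prev` to `h_t` is preserved.

```python
import numpy as np

def srnn_step(x_t, h_prev, params):
    """One step of a hypothetical subtraction-gated RNN cell.

    Instead of multiplying the state by a gate in (0, 1) as LSTM does,
    a learned subtrahend term is subtracted from it, leaving the
    identity connection h_prev -> h_t intact for gradient flow.
    All parameter names here are illustrative assumptions.
    """
    W_s, U_s, b_s, W_c, U_c, b_c = params
    # subtrahend term: how much to remove from the carried state
    s_t = np.tanh(W_s @ x_t + U_s @ h_prev + b_s)
    # candidate update from the current input and previous state
    c_t = np.tanh(W_c @ x_t + U_c @ h_prev + b_c)
    # identity path retained: h_prev passes through unscaled,
    # with the subtrahend subtracted rather than a gate multiplied in
    h_t = h_prev - s_t + c_t
    return h_t

# tiny usage example with random weights
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
params = (rng.standard_normal((n_hid, n_in)) * 0.1,
          rng.standard_normal((n_hid, n_hid)) * 0.1,
          np.zeros(n_hid),
          rng.standard_normal((n_hid, n_in)) * 0.1,
          rng.standard_normal((n_hid, n_hid)) * 0.1,
          np.zeros(n_hid))
h = np.zeros(n_hid)
for _ in range(5):
    h = srnn_step(rng.standard_normal(n_in), h, params)
print(h.shape)
```

Because the state update is additive (`h_prev - s_t + c_t`), the Jacobian of `h_t` with respect to `h_prev` contains an identity term, which is the property the abstract credits for keeping the error flow constant during training.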
- Published
- 2022