Output gate of an LSTM

Prerequisites

Activation of a layer | \(x^k = \sigma(\mathbf{W}[1; x^{k-1}])\)

Description

This equation represents the output gate of an LSTM. It transforms an external signal, \(x^{g^\text{output}}\), consisting of the outputs of other LSTM blocks and of other neurons in the neural network, in the same way as a layer of a typical multi-layer perceptron.

\[g^\text{output}(n+1) = \sigma(\mathbf{W}^{g^\text{output}}[1; x^{g^\text{output}}])\]
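As a minimal sketch of evaluating this equation (the function and variable names are illustrative, not taken from the source), the bias column of \(\mathbf{W}^{g^\text{output}}\) can be applied automatically by prepending a constant 1 to the input vector:

```python
import numpy as np

def sigmoid(z):
    # Elementwise logistic sigmoid; maps any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def output_gate(W, x):
    # g_output = sigma(W @ [1; x]): prepend a constant 1 to the input so
    # that the first column of W acts as the bias vector.
    x_aug = np.concatenate(([1.0], x))
    return sigmoid(W @ x_aug)

# A gate with 3 units reading a 4-dimensional external signal.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 1 + 4))   # weights and biases in one matrix
x = rng.normal(size=4)            # outputs of other blocks / neurons
print(output_gate(W, x))          # three gate values, each in (0, 1)
```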

Symbols Used:

\( g^\text{output} \)

This symbol represents the state of the output gate of the LSTM.

\( \mathbf{W} \)

This symbol represents the matrix containing the weights and biases of a layer in a neural network.

\( \sigma \)

This symbol represents the sigmoid function.

\( n \)

This symbol represents any given whole number, \( n \in \mathbb{W} \).

Derivation

Notice that the equation is analogous to the activation of a single layer:

\[x^k = \sigma(\mathbf{W}[1; x^{k-1}])\]

The derivation of this equation follows the same steps as that of the Activation of a layer, except that the activation function is strictly the sigmoid; no other activation can be used, since the gate's output must lie in \((0, 1)\) to act as a multiplicative gate on other signals.
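To make the analogy concrete, here is a hypothetical sketch (function names are illustrative): the output gate is the generic layer activation with the activation function pinned to the sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_activation(W, x_prev, activation):
    # Generic layer: x^k = activation(W @ [1; x^{k-1}]), for any activation.
    return activation(W @ np.concatenate(([1.0], x_prev)))

def output_gate(W, x):
    # The output gate is the same computation with the activation
    # fixed to the sigmoid.
    return layer_activation(W, x, sigmoid)
```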
