Weight update of a Hopfield Network

Prerequisites

Training pattern | \( \xi \)

Description

This equation updates the weight matrix of a Hopfield Network. It takes a single training pattern into account and changes the weights to represent the correspondences between the components of that pattern.

\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) + \htmlClass{sdt-0000000106}{\mu} (\htmlClass{sdt-0000000111}{\xi}\htmlClass{sdt-0000000111}{\xi}^{\htmlClass{sdt-0000000022}{T}} - \htmlClass{sdt-0000000089}{I})\]
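As a concrete illustration, a minimal NumPy sketch of this update could look as follows (the function name hopfield_weight_update and the use of NumPy are illustrative assumptions, not part of the source):

```python
import numpy as np

def hopfield_weight_update(W_prev, xi, mu):
    """One weight update: W(n) = W(n-1) + mu * (xi xi^T - I)."""
    xi = np.asarray(xi, dtype=float).reshape(-1, 1)  # training pattern as a column vector
    L = xi.shape[0]
    outer = xi @ xi.T                                # +1 where components agree, -1 where they differ
    return W_prev + mu * (outer - np.eye(L))         # subtract I to zero the self-correspondences
```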

Symbols Used:

\( T \)

This symbol represents the transpose of a matrix or vector. Taking the transpose of a matrix or vector essentially means flipping the matrix or vector over its diagonal.

\( \mathbf{W} \)

This symbol represents the matrix containing the weights and biases of a layer in a neural network.

\( I \)

This is the symbol for the identity matrix. It behaves such that, for some matrix \(A\), \(AI = A\) and \(IA = A\).

\( \mu \)

This is the symbol representing the learning rate.

\( \xi \)

This symbol represents a single training pattern.

\( n \)

This symbol represents any given whole number, \( n \in \htmlClass{sdt-0000000014}{\mathbb{W}}\).

Derivation

Consider the definition of a training pattern in a Hopfield network:

This symbol, \(\xi\), represents a single training pattern, typically used for training a Hopfield Network. It is a vector of 1s and -1s: \(\xi \in \{1,-1\}^L\)


Pay attention to the fact that \(\htmlClass{sdt-0000000111}{\xi} \in \{1,-1\}^{\htmlClass{sdt-0000000044}{L}}\). The outer product of the pattern with its transpose results in a matrix of 1s and -1s, where a 1 at position \(\htmlClass{sdt-0000000018}{i},\htmlClass{sdt-0000000011}{j}\) indicates that \(\htmlClass{sdt-0000000111}{\xi}_{\htmlClass{sdt-0000000018}{i}} = \htmlClass{sdt-0000000111}{\xi}_{\htmlClass{sdt-0000000011}{j}}\), and a -1 at position \(\htmlClass{sdt-0000000018}{i},\htmlClass{sdt-0000000011}{j}\) indicates that \(\htmlClass{sdt-0000000111}{\xi}_{\htmlClass{sdt-0000000018}{i}} \not = \htmlClass{sdt-0000000111}{\xi}_{\htmlClass{sdt-0000000011}{j}}\).

By subtracting an identity matrix, \(\htmlClass{sdt-0000000089}{I}\), from \(\htmlClass{sdt-0000000111}{\xi} \htmlClass{sdt-0000000111}{\xi}^{\htmlClass{sdt-0000000022}{T}}\), we zero the entries on the diagonal, ignoring the self-correspondences.

Our goal is to iteratively train a weight matrix, \( \htmlClass{sdt-0000000059}{\mathbf{W}} \), that learns the correspondences between values within a training pattern. We want to shift the weights in a direction that amplifies these correspondences: for a positive correspondence, the weight increases by \(\htmlClass{sdt-0000000106}{\mu}\); for a negative correspondence, the weight decreases by \(\htmlClass{sdt-0000000106}{\mu}\).

After training, a high weight \(\htmlClass{sdt-0000000059}{\mathbf{W}}_{\htmlClass{sdt-0000000018}{i} \htmlClass{sdt-0000000011}{j}}\) indicates that neurons \(\htmlClass{sdt-0000000018}{i}\) and \(\htmlClass{sdt-0000000011}{j}\) often take the same value.
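To sketch what this iterative training could look like over several patterns, here is a hedged NumPy example (the function train_hopfield, the choice of a zero initial weight matrix, and the default learning rate are assumptions made for illustration):

```python
import numpy as np

def train_hopfield(patterns, mu=0.1):
    """Apply the weight update once per training pattern.

    patterns: iterable of vectors with entries in {1, -1}.
    """
    patterns = [np.asarray(p, dtype=float).reshape(-1, 1) for p in patterns]
    L = patterns[0].shape[0]
    W = np.zeros((L, L))                      # assumed initial weights
    for xi in patterns:
        W = W + mu * (xi @ xi.T - np.eye(L))  # W(n) = W(n-1) + mu (xi xi^T - I)
    return W
```

In this sketch, weights between components that tend to agree across patterns grow positive, while weights between components that tend to disagree grow negative.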

Example

Let's set the original weight matrix to

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) = \begin{bmatrix} 0 & -0.6 \\ -0.3 & 0 \end{bmatrix} \]

and the learning rate to

\[ \htmlClass{sdt-0000000106}{\mu} = 0.1 \]

We consider the pattern

\[ \htmlClass{sdt-0000000111}{\xi} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

We can substitute these values and solve this step by step.

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) + \htmlClass{sdt-0000000106}{\mu} (\htmlClass{sdt-0000000111}{\xi}\htmlClass{sdt-0000000111}{\xi}^{\htmlClass{sdt-0000000022}{T}} - \htmlClass{sdt-0000000089}{I}) \]

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix} 0 & -0.6 \\ -0.3 & 0 \end{bmatrix} + 0.1 ( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix}^{\htmlClass{sdt-0000000022}{T}} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}) \]

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix} 0 & -0.6 \\ -0.3 & 0 \end{bmatrix} + 0.1 ( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & -1 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}) \]

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix} 0 & -0.6 \\ -0.3 & 0 \end{bmatrix} + 0.1 ( \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}) \]

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix} 0 & -0.6 \\ -0.3 & 0 \end{bmatrix} + \begin{bmatrix} 0 & -0.1 \\ -0.1 & 0 \end{bmatrix} \]

\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix} 0 & -0.7 \\ -0.4 & 0 \end{bmatrix} \]
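As a quick sanity check of this arithmetic, the same numbers can be plugged into a short, self-contained NumPy sketch (variable names are illustrative):

```python
import numpy as np

W_prev = np.array([[0.0, -0.6],
                   [-0.3, 0.0]])   # W(n - 1)
xi = np.array([[1.0], [-1.0]])     # training pattern as a column vector
mu = 0.1

W_next = W_prev + mu * (xi @ xi.T - np.eye(2))  # W(n) = W(n-1) + mu (xi xi^T - I)
print(W_next)  # [[ 0.  -0.7]
               #  [-0.4  0. ]]
```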

References

  1. Jaeger, H. (n.d.). Neural Networks (AI) (WBAI028-05) Lecture Notes BSc program in Artificial Intelligence. Retrieved April 27, 2024, from https://www.ai.rug.nl/minds/uploads/LN_NN_RUG.pdf