Heteroassociative Hopfield networks, in contrast to autoassociative Hopfield networks, are designed to associate different patterns with one another rather than to store and retrieve the same pattern. This is done using a slightly modified Hebbian learning rule, which is the subject of this equation.
In this particular example, we consider a case with a circular sequence of training patterns.
\( i \) | This is the symbol for an iterator, a variable that changes value to refer to a sequence of elements. |
\( T \) | This symbol represents the transpose of a matrix or vector. Taking the transpose of a matrix or vector essentially means flipping the matrix or vector over its diagonal. |
\( \mathbf{W} \) | This symbol represents the matrix containing the weights and biases of a layer in a neural network. |
\( \mu \) | This is the symbol representing the learning rate. |
\( \xi \) | This symbol represents a single training pattern. |
\( n \) | This symbol represents any given whole number, \( n \in \htmlClass{sdt-0000000014}{\mathbb{W}}\). |
Let us begin by considering how we update the weights of a conventional (autoassociative) Hopfield network:
\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) + \htmlClass{sdt-0000000106}{\mu} (\htmlClass{sdt-0000000111}{\xi}\htmlClass{sdt-0000000111}{\xi}^{\htmlClass{sdt-0000000022}{T}} - \htmlClass{sdt-0000000089}{I})\]
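To make this update concrete, here is a minimal sketch of one such step in Python/NumPy. It assumes a bipolar pattern stored as a column vector; the specific values are made up for illustration.

```python
import numpy as np

def hopfield_update(W, xi, mu):
    """One autoassociative Hebbian step: W(n) = W(n-1) + mu * (xi xi^T - I)."""
    xi = xi.reshape(-1, 1)              # treat the pattern as a column vector
    I = np.eye(W.shape[0])              # identity term zeroes the main diagonal
    return W + mu * (xi @ xi.T - I)

# Hypothetical example: a 2-unit network and one bipolar pattern
W = np.zeros((2, 2))
xi = np.array([1, -1])
W = hopfield_update(W, xi, mu=0.1)
print(W)                                # [[0.0, -0.1], [-0.1, 0.0]]
```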
As noted above, we are considering a case with a circular sequence of training patterns, so...
\[\htmlClass{sdt-0000000111}{\xi}^{(1)},\:\:\dots,\:\:\htmlClass{sdt-0000000111}{\xi}^{ (\htmlClass{sdt-0000000007}{N})} = \htmlClass{sdt-0000000111}{\xi}^{(1)}\]
Because we are now looking at a heteroassociative Hopfield network, patterns are no longer purely self-associative but instead relate to one another. For this reason, we change the outer product to be between consecutive patterns in the sequence.
The purpose of subtracting the identity matrix is to zero out the main diagonal and so avoid self-connections. However, since we are now multiplying two different training patterns, it is no longer needed.
By substituting \(\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i}+1)}\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i})\htmlClass{sdt-0000000022}{T}}\) for \(\htmlClass{sdt-0000000111}{\xi} \htmlClass{sdt-0000000111}{\xi}^{\htmlClass{sdt-0000000022}{T}}\) and dropping the now-redundant identity matrix (\( \htmlClass{sdt-0000000089}{I} \)), we get:
\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) + \htmlClass{sdt-0000000106}{\mu}\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i} + 1)}\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i})\htmlClass{sdt-0000000022}{T}} \]
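As a hedged sketch of how this rule might be applied across a circular sequence of patterns, the snippet below pairs each pattern with its successor and wraps around at the end of the sequence. The particular patterns, learning rate, and column-vector convention are assumptions for illustration only.

```python
import numpy as np

def heteroassociative_update(W, xi_next, xi_curr, mu):
    """One heteroassociative Hebbian step: W(n) = W(n-1) + mu * xi^(i+1) xi^(i)^T."""
    xi_next = xi_next.reshape(-1, 1)    # column vector
    xi_curr = xi_curr.reshape(-1, 1)
    return W + mu * (xi_next @ xi_curr.T)   # outer product, no identity term

# Hypothetical circular sequence xi^(1), ..., xi^(N) = xi^(1)
patterns = [np.array([1, -1]), np.array([-1, 1]), np.array([1, 1])]
W = np.zeros((2, 2))
mu = 0.1
for i in range(len(patterns)):
    xi_i = patterns[i]
    xi_ip1 = patterns[(i + 1) % len(patterns)]  # wrap around: circular sequence
    W = heteroassociative_update(W, xi_ip1, xi_i, mu)
print(W)
```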
as required
Consider a starting weight matrix:
\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) = \begin{bmatrix}0 & -0.6\\-0.3 & 0\end{bmatrix}\]
with a learning rate of:
\[\htmlClass{sdt-0000000106}{\mu} = 0.1\]
And training patterns of interest:
\[\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i})} = \begin{bmatrix}1 \\ -1\end{bmatrix}\]
\[\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i}+1)} = \begin{bmatrix}-1 \\ 1\end{bmatrix}\]
We can now plug these in to our equation:
\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n} - 1) + \htmlClass{sdt-0000000106}{\mu}\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i} + 1)}\htmlClass{sdt-0000000111}{\xi}^{(\htmlClass{sdt-0000000018}{i})\htmlClass{sdt-0000000022}{T}} \]
to get:
\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}0 & -0.6\\-0.3 & 0\end{bmatrix} + 0.1 \cdot \begin{bmatrix} -1 \\ 1 \end{bmatrix} \begin{bmatrix}1 \\ -1\end{bmatrix}^{\htmlClass{sdt-0000000022}{T}}\]
which simplifies to...
\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}0 & -0.6\\-0.3 & 0\end{bmatrix} + 0.1 \cdot \begin{bmatrix} -1 \\ 1 \end{bmatrix} \begin{bmatrix}1 & -1\end{bmatrix}\]
which simplifies further to... (note that the product of the column vector and the row vector here is an outer product, which yields a matrix; it should not be confused with the cross product or the inner product)
\[\htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}0 & -0.6\\-0.3 & 0\end{bmatrix} + 0.1 \cdot \begin{bmatrix} -1 & 1 \\ 1 & - 1\end{bmatrix}\]
We are therefore left with the addition of two matrices:
\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}0 & -0.6\\-0.3 & 0\end{bmatrix} + \begin{bmatrix} -0.1 & 0.1 \\ 0.1 & -0.1\end{bmatrix} \]
which can be calculated element-wise as:
\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}0 - 0.1 & -0.6 + 0.1\\-0.3 + 0.1 & 0 -0.1\end{bmatrix} \]
which yields the answer:
\[ \htmlClass{sdt-0000000059}{\mathbf{W}}(\htmlClass{sdt-0000000117}{n}) = \begin{bmatrix}-0.1 & -0.5\\-0.2 &-0.1\end{bmatrix} \]
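As a quick sanity check of the arithmetic, the following NumPy snippet reproduces the worked example above using the same starting weight matrix, learning rate, and patterns.

```python
import numpy as np

W_prev = np.array([[0.0, -0.6],
                   [-0.3, 0.0]])            # W(n - 1), the starting weight matrix
mu = 0.1
xi_i = np.array([1, -1])                    # xi^(i)
xi_ip1 = np.array([-1, 1])                  # xi^(i+1)

W_n = W_prev + mu * np.outer(xi_ip1, xi_i)  # xi^(i+1) xi^(i)^T is an outer product
print(W_n)                                  # expected: [[-0.1, -0.5], [-0.2, -0.1]]
```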