This equation calculates the energy of a state in a Hopfield Network. It originates from statistical mechanics, where expressions of the same form describe the energy of spin systems, and it is used when learning and retrieving patterns: the stored patterns correspond to minima of the energy.
\( j \) | This is a secondary symbol for an iterator, a variable that changes value to refer to a series of elements. |
\( i \) | This is the symbol for an iterator, a variable that changes value to refer to a sequence of elements. |
\( L \) | This symbol represents the size of a layer in a neural network; here, it is the number of neurons in the network. |
\( \mathbf{x} \) | This symbol represents a state of the dynamical system at some time point. |
\( \mathbf{W} \) | This symbol represents the matrix containing the weights of a layer in a neural network; in a Hopfield Network, entry \( \mathbf{W}_{ij} \) is the connection weight between neurons \( i \) and \( j \). |
\( \sum \) | This is the summation symbol in mathematics; it represents the sum of a sequence of numbers. |
\( E \) | This symbol represents the energy. |
Remember that the state in the Hopfield Network consists only of -1s and 1s: \(\htmlClass{sdt-0000000046}{\mathbf{x}} \in \{-1,1\}^{\htmlClass{sdt-0000000044}{L}}\). Intuitively, a Hopfield Network learns the correspondence between pairs of neurons. If a state is equal to one of the stored patterns, we want the energy to be minimal. Note that the product \(\htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}} \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}}\) equals \(+1\) when the two neurons agree and \(-1\) when they disagree. This implies that if \(\htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}}\) and \(\htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}}\) are the same, their weight, \(\htmlClass{sdt-0000000059}{\mathbf{W}}_{\htmlClass{sdt-0000000018}{i} \htmlClass{sdt-0000000011}{j}}\), should be high.
By calculating these "correspondence scores" for every pair of neurons and summing them together, we get a very large value if the state resembles one of the stored patterns. Because of the negative sign in front of the summation, the energy in this situation is very low.
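To make this concrete, here is a minimal sketch of the energy calculation in Python with NumPy (the function names `hopfield_energy` and `hopfield_energy_sum` are our own choices, for illustration only):

```python
import numpy as np

def hopfield_energy(W: np.ndarray, x: np.ndarray) -> float:
    """Energy in matrix form: E(x) = -1/2 * x^T W x."""
    return -0.5 * float(x @ W @ x)

def hopfield_energy_sum(W: np.ndarray, x: np.ndarray) -> float:
    """The same energy written as an explicit double sum over neuron pairs."""
    L = len(x)
    total = 0.0
    for i in range(L):
        for j in range(L):
            # "Correspondence score" of the pair (i, j): positive when the
            # sign of the weight matches the agreement of the two neurons.
            total += W[i, j] * x[i] * x[j]
    return -0.5 * total
```

Both functions compute the same value; the double-sum version mirrors the summation notation, while the matrix version mirrors the vectorized form used below.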
Let the weight matrix be (a toy matrix chosen only to illustrate the arithmetic; in a trained Hopfield Network, \(\htmlClass{sdt-0000000059}{\mathbf{W}}\) is symmetric with zeros on the diagonal):
\[ \htmlClass{sdt-0000000059}{\mathbf{W}} = \begin{bmatrix} 0.5 & 0.4 \\ 0.3 & 0.2 \end{bmatrix} \]
and the state be:
\[ \htmlClass{sdt-0000000046}{\mathbf{x}} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
We can easily calculate the energy of that state:
\[ \htmlClass{sdt-0000000100}{E}(\htmlClass{sdt-0000000046}{\mathbf{x}}) = -\frac{1}{2}\htmlClass{sdt-0000000046}{\mathbf{x}}^{\htmlClass{sdt-0000000022}{T}} \htmlClass{sdt-0000000059}{\mathbf{W}} \htmlClass{sdt-0000000046}{\mathbf{x}} \]
\[ \htmlClass{sdt-0000000100}{E}(\htmlClass{sdt-0000000046}{\mathbf{x}}) = -\frac{1}{2} \begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} 0.5 & 0.4 \\ 0.3 & 0.2 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
\[ \htmlClass{sdt-0000000100}{E}(\htmlClass{sdt-0000000046}{\mathbf{x}}) = -\frac{1}{2} \begin{bmatrix} 0.2 & 0.2 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
\[ \htmlClass{sdt-0000000100}{E}(\htmlClass{sdt-0000000046}{\mathbf{x}}) = -\frac{1}{2} \cdot 0 = 0 \]
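We can verify this result numerically with a short, self-contained Python check (assuming nothing beyond the equation above):

```python
import numpy as np

W = np.array([[0.5, 0.4],
              [0.3, 0.2]])
x = np.array([1, -1])

E = -0.5 * float(x @ W @ x)  # E(x) = -1/2 * x^T W x
print(E)  # 0.0 (may print as -0.0 because of floating-point signed zero)
```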