A neural network built only from matrix multiplications can represent only linear functions. To make it non-linear, we apply a non-linear function, called an activation function, after each linear transformation. We typically denote it as \( \htmlClass{sdt-0000000051}{\sigma} \). Note that activation functions need not be bijective: for example, ReLU maps every negative input to zero.
The function can operate on scalars or on vectors. When applied to a vector, it acts element-wise, applying the same scalar function to each entry of the vector.
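The element-wise behaviour can be sketched in a few lines of NumPy. This is a minimal illustration, not a canonical implementation; the logistic sigmoid is chosen here only as a familiar example of \( \sigma \), and the function names are illustrative.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: a common non-linear activation,
    # mapping any real number into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# On a scalar it behaves like an ordinary real function:
print(sigmoid(0.0))  # 0.5

# On a vector, NumPy applies the same scalar function
# to each entry independently (element-wise):
v = np.array([-2.0, 0.0, 2.0])
print(sigmoid(v))
```

The same pattern holds for other activations such as ReLU (`np.maximum(0, x)`): the function is defined on scalars, and vector inputs are handled entry by entry.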
\( n \) | This symbol represents any given whole number, \( n \in \htmlClass{sdt-0000000014}{\mathbb{W}}\). |
\( \sigma \) | This symbol represents the activation function. It maps real values to other real values in a non-linear way, and is applied element-wise to vectors. |
\( \mathbb{R} \) | This is the symbol for the set of real numbers. |