
L2 Regularization

Description

L2 (Ridge) regularization is a popular regularizer that computes the sum of the squared model parameters, encouraging the optimization procedure to find models with smaller-magnitude parameters. It is typically added as an extra term to the loss function to reduce the magnitude of the model's weights, which can help prevent overfitting.

\[\htmlClass{sdt-0000000076}{\textup{reg}}(\htmlClass{sdt-0000000066}{\theta}) = \sum_{\htmlClass{sdt-0000000059}{\mathbf{W}} \in \htmlClass{sdt-0000000066}{\theta}} \htmlClass{sdt-0000000059}{\mathbf{W}}^2 = \Vert \htmlClass{sdt-0000000066}{\theta} \Vert^2\]
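The formula above can be sketched in code as follows. This is a minimal NumPy illustration, not an implementation from the source; the names `l2_penalty`, `regularized_loss`, and the regularization strength `lam` are assumptions chosen for clarity.

```python
import numpy as np

def l2_penalty(params):
    """Sum of squared entries over all parameter arrays: the ||theta||^2 term."""
    return sum(np.sum(W ** 2) for W in params)

def regularized_loss(data_loss, params, lam=0.01):
    """Total loss = task loss + lam * ||theta||^2 (lam controls the penalty strength)."""
    return data_loss + lam * l2_penalty(params)

# Example: two parameter arrays standing in for the weights of two layers.
params = [np.array([1.0, 2.0]), np.array([[3.0]])]
print(l2_penalty(params))                      # 1 + 4 + 9 = 14.0
print(regularized_loss(0.5, params, lam=0.1))  # 0.5 + 0.1 * 14
```

Because the penalty grows with the square of each weight, large weights are penalized disproportionately, which is what pushes the optimizer toward smaller absolute parameters.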

Symbols Used:

\( \theta \)

This is the symbol we use for model weights/parameters.

\( \mathbf{W} \)

This symbol represents the matrix containing the weights and biases of a layer in a neural network.

\( \textup{reg} \)

This is the symbol used for representing a regularization function.
