Approximation of Performance Landscape

Description

The performance landscape \( \htmlClass{sdt-0000000062}{R}(\htmlClass{sdt-0000000083}{\theta}) \) can be approximated with a second-order Taylor expansion. This makes it possible to estimate the model's risk in the neighborhood of the chosen parameters \( \htmlClass{sdt-0000000083}{\theta} \).

Symbols Used:

\( x \)	This is a symbol for any generic variable. It can hold any value, whether an integer, a real number, a complex number, a matrix, etc.
\( R \)	This symbol denotes the risk of a model.
\( \sum \)	This is the summation symbol in mathematics; it represents the sum of a sequence of numbers.
\( \theta \)	This symbol represents the parameters of the model.

Derivation

Consider the performance landscape \( \htmlClass{sdt-0000000062}{R}(\htmlClass{sdt-0000000083}{\theta}) \).

Note that \( \htmlClass{sdt-0000000062}{R}(\htmlClass{sdt-0000000083}{\theta}) \) can be a highly complex function with high curvature. However, if we choose a single point \( \htmlClass{sdt-0000000083}{\theta} \) in the domain \( \htmlClass{sdt-0000000052}{\Theta} \), we can calculate the second derivatives of the risk with respect to each of the \( D \) parameters: \[\frac{\partial^2 \htmlClass{sdt-0000000062}{R}}{\partial \htmlClass{sdt-0000000083}{\theta}_i^2},\] which can then be used to approximate the risk \( \htmlClass{sdt-0000000062}{R} \) locally. If we denote this approximation as \[\hat{\htmlClass{sdt-0000000062}{R}}(\htmlClass{sdt-0000000083}{\theta}),\] then the shape of the performance landscape near that point is approximately:

\[\hat{\htmlClass{sdt-0000000062}{R}}(\htmlClass{sdt-0000000083}{\theta}) = \frac{1}{2}\htmlClass{sdt-0000000080}{\sum}_{i=1}^D \frac{\partial^2 \htmlClass{sdt-0000000062}{R}}{\partial \htmlClass{sdt-0000000083}{\theta}_i^2}\,\htmlClass{sdt-0000000083}{\theta}_i^2\]

Note that this formulation assumes that we chose the origin (all parameters equal to zero) as the expansion point, so the second derivatives are evaluated there. If we chose a different point, the Taylor expansion around \( \htmlClass{sdt-0000000083}{\theta} \) would result in a much more complex formula.
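
To make explicit which terms the sum-of-squares form keeps, it may help to compare it with the full second-order Taylor expansion around the origin. The following is a sketch that assumes \( R \) is twice differentiable at the origin:

\[R(\theta) \approx R(0) + \sum_{i=1}^D \frac{\partial R}{\partial \theta_i}(0)\,\theta_i + \frac{1}{2}\sum_{i=1}^D\sum_{j=1}^D \frac{\partial^2 R}{\partial \theta_i \, \partial \theta_j}(0)\,\theta_i\theta_j\]

The sum-of-squares form is what remains when the constant \( R(0) \) is ignored (it shifts the landscape vertically without changing its shape), the first derivatives vanish at the origin, and the mixed second derivatives (\( i \neq j \)) vanish or are neglected. Whether the factor \( \frac{1}{2} \) is written explicitly or absorbed into the coefficients is a matter of convention.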

Example

Let's say that \[\htmlClass{sdt-0000000062}{R}(\htmlClass{sdt-0000000083}{\theta})=\htmlClass{sdt-0000000127}{\sin}^2(\htmlClass{sdt-0000000083}{\theta}_1)-\htmlClass{sdt-0000000124}{\cos}^2(\htmlClass{sdt-0000000083}{\theta}_2).\]

[Figure: plot of this function.]

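Then, near the origin, we can approximate this function with \[\hat{\htmlClass{sdt-0000000062}{R}}(\htmlClass{sdt-0000000083}{\theta}) = \htmlClass{sdt-0000000083}{\theta}_1^2 + \htmlClass{sdt-0000000083}{\theta}_2^2.\] The two functions differ by a constant offset of \( -1 \), the value of the risk at the origin, which shifts the landscape vertically but does not change its shape. If we plot this approximation, we see that the two functions behave similarly near the origin.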

See this visualization on Desmos.
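
The example can also be checked numerically. The sketch below is not part of the original page: the helper names `risk`, `second_derivative`, and `quadratic_approx` are hypothetical, the second derivatives are estimated with central finite differences rather than computed analytically, and the constant term \( R(0) \) is kept so that the values can be compared directly.

```python
import math


def risk(theta1, theta2):
    """Example risk R(theta) = sin^2(theta_1) - cos^2(theta_2)."""
    return math.sin(theta1) ** 2 - math.cos(theta2) ** 2


def second_derivative(f, theta, i, h=1e-4):
    """Central finite-difference estimate of d^2 f / d theta_i^2 at theta."""
    up = list(theta)
    down = list(theta)
    up[i] += h
    down[i] -= h
    return (f(*up) - 2.0 * f(*theta) + f(*down)) / h ** 2


def quadratic_approx(f, theta, origin=(0.0, 0.0)):
    """R(0) + 1/2 * sum_i (d^2 R / d theta_i^2)(0) * theta_i^2.

    Assumption: the expansion point is the origin, and the gradient and
    mixed second derivatives are ignored, as in the text.
    """
    curvatures = [second_derivative(f, origin, i) for i in range(len(origin))]
    return f(*origin) + 0.5 * sum(c * t ** 2 for c, t in zip(curvatures, theta))


origin = (0.0, 0.0)
curvatures = [second_derivative(risk, origin, i) for i in range(2)]
print("diagonal second derivatives at the origin:", curvatures)

# Compare the true risk with the approximation near and far from the origin.
for point in [(0.05, 0.05), (0.2, -0.1), (1.0, 1.0)]:
    print(point, risk(*point), quadratic_approx(risk, point))
```

Both estimated second derivatives come out close to 2, which together with the factor \( \frac{1}{2} \) gives the unit coefficients in the approximation above. The agreement is close near the origin and degrades as the point moves away from it.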

In compact form, with generic coefficients \( \htmlClass{sdt-0000000003}{x}_i \) standing in for the second-derivative terms, the approximation reads:

\[\htmlClass{sdt-0000000062}{R}(\htmlClass{sdt-0000000083}{\theta}) = \htmlClass{sdt-0000000080}{\sum}_{i=1}^D\htmlClass{sdt-0000000003}{x}_i\htmlClass{sdt-0000000083}{\theta}_i^2\]