Markov Transition Matrix Entries

Prerequisites

Markov Property | \(p\left( \mathbf{X}_{n+1} = \mathbf{x}_j \,\vert\, \mathbf{X}_{n} = \mathbf{x}_i, ... , \mathbf{X}_{1} = \mathbf{x}_1 \right) = p\left( \mathbf{X}_{n+1} = \mathbf{x}_j \,\vert\, \mathbf{X}_{n} = \mathbf{x}_i \right)\)

Description

A Markov transition matrix encodes essential information about a discrete stochastic system with a finite state space. It can be used to determine the probability distribution of the next state \( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} \) at time \( \htmlClass{sdt-0000000117}{n}+1 \), given the distribution of the current state \( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} \) at time \( \htmlClass{sdt-0000000117}{n} \).

\[\left[ \htmlClass{sdt-0000000087}{T} \right]_{\htmlClass{sdt-0000000018}{i},\htmlClass{sdt-0000000011}{j}} = p\left( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}} \,\vert\, \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}} \right)\]

Symbols Used:

\( T \)

This symbol represents a Markov transition matrix.

\( X \)

This symbol represents a random variable. It is a measurable function that maps a sample space of possible outcomes to a measurable space.

\( j \)

This is a secondary symbol for an iterator, a variable that changes value to refer to a sequence of elements.

\( i \)

This is the symbol for an iterator, a variable that changes value to refer to a sequence of elements.

\( \mathbf{x} \)

This symbol represents a state of the dynamical system at some time point.

\( n \)

This symbol represents any given whole number, \( n \in \htmlClass{sdt-0000000014}{\mathbb{W}}\).
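
To make the indexing concrete, here is a minimal NumPy sketch. The three-state system and its probabilities are illustrative, not taken from any source; note also that Python arrays are 0-indexed, while the states in the formulas are numbered from 1:

```python
import numpy as np

# Illustrative transition matrix for a hypothetical 3-state system.
# Entry T[i, j] is the probability of moving from state x_i to
# state x_j in a single time step.
T = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.8, 0.1],
    [0.4, 0.4, 0.2],
])

# Probability of transitioning from the first state to the third
# (rows and columns are 0-indexed in Python).
i, j = 0, 2
print(T[i, j])  # 0.2
```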

Derivation

  1. Consider a dynamical system with \( \vert \htmlClass{sdt-0000000038}{\mathcal{X}} \vert = n\) states:
    \[ \htmlClass{sdt-0000000038}{\mathcal{X}} = \left\{ \htmlClass{sdt-0000000046}{\mathbf{x}}_1, \htmlClass{sdt-0000000046}{\mathbf{x}}_2, ... , \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000117}{n}} \right\} \]
  2. For stochastic systems, the transition from state \( \htmlClass{sdt-0000000046}{\mathbf{x}}(\htmlClass{sdt-0000000117}{n}) \) at time \( \htmlClass{sdt-0000000117}{n} \) into the state \( \htmlClass{sdt-0000000046}{\mathbf{x}}(\htmlClass{sdt-0000000117}{n}+1) \) at time \( \htmlClass{sdt-0000000117}{n}+1 \) is not necessarily uniquely determined.
  3. This makes the state at any time \( \htmlClass{sdt-0000000117}{n} \) an instance of a random variable, which we denote by \( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} \).
  4. The Markov property simplifies the stochastic behaviour to a single time step: transitions depend only on the previous state. This means conditioning on more than one previous state is unnecessary:
    \[p\left( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}} \,\vert\, \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}}, ... , \mathbf{\htmlClass{sdt-0000000005}{X}}_{1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_1 \right) = p\left( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}} \,\vert\, \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}} \right)\]
  5. For a finite number of states where transitions are possible between any two states within \( \htmlClass{sdt-0000000038}{\mathcal{X}} \), a matrix matches our needs of encoding all possible transitions:
    \[ T = \begin{bmatrix} p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\ p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\ p_{3,1} & p_{3,2} & \cdots & p_{3,n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n,1} & p_{n,2} & \cdots & p_{n,n} \end{bmatrix},\]
    where \( p_{\htmlClass{sdt-0000000018}{i}, \htmlClass{sdt-0000000011}{j}} \) is the probability of transition from state \( \htmlClass{sdt-0000000018}{i} \) to state \( \htmlClass{sdt-0000000011}{j} \):
    \[ p_{\htmlClass{sdt-0000000018}{i},\htmlClass{sdt-0000000011}{j}} = p\left( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}} \,\vert\, \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}} \right). \]
    (Note how each row sums to 1: a transition to some state is guaranteed at the next time step, regardless of the current state. A numerical check follows this derivation.)
  6. The elements of the matrix correspond to the probabilities:
    \[ \left[ \htmlClass{sdt-0000000087}{T} \right]_{\htmlClass{sdt-0000000018}{i} ,\htmlClass{sdt-0000000011}{j}} = p_{\htmlClass{sdt-0000000018}{i},\htmlClass{sdt-0000000011}{j}} = p\left( \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}+1} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000011}{j}} \,\vert\, \mathbf{\htmlClass{sdt-0000000005}{X}}_{\htmlClass{sdt-0000000117}{n}} = \htmlClass{sdt-0000000046}{\mathbf{x}}_{\htmlClass{sdt-0000000018}{i}} \right) \]
    as required.
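
As the note in step 5 anticipates, the row-sum property is easy to verify numerically. The sketch below (reusing the illustrative 3-state matrix from earlier) checks that every row sums to 1 and samples one stochastic transition from the conditional distribution stored in a row:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# The same illustrative 3-state transition matrix as above.
T = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.8, 0.1],
    [0.4, 0.4, 0.2],
])

# Step 5: each row is a conditional distribution over next states,
# so every row must sum to 1.
assert np.allclose(T.sum(axis=1), 1.0)

# Row i encodes p(X_{n+1} = . | X_n = x_i), so one stochastic
# transition out of state i is a draw from that row.
i = 1
next_state = rng.choice(len(T), p=T[i])
print(f"transitioned from state {i} to state {next_state}")
```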

Example

Consider the following Markov chain with two possible states, \( \htmlClass{sdt-0000000038}{\mathcal{X}} = \{ A, B \} \):

(Figure: two-state Markov chain.)
  1. From state \( A \), there is a 60% probability of staying in state \( A \) and 40% probability of transitioning to \( B \). This gives us \( p_{A, A} = 0.6 \) and \( p_{A, B} = 0.4 \).
  2. From state \( B \), there is a 10% probability of staying in state \( B \) and 90% probability of transitioning to \( A \). This gives us \( p_{B, A} = 0.9 \) and \( p_{B, B} = 0.1 \).
  3. The transition matrix is then:
    \[ T = \begin{bmatrix} 0.6 & 0.4 \\ 0.9 & 0.1 \end{bmatrix} \]
  4. The matrix representation of the update operator allows us to find the probability distribution for the next state \( \mathbf{X}_{n+1} \) given that of \( \mathbf{X}_{n} \).
  5. If we know for sure that \( \mathbf{X}_n = A \), this can be encoded in the row vector:
    \[ \htmlClass{sdt-0000000046}{\mathbf{x}}_{n} = \begin{bmatrix} 1 & 0 \end{bmatrix} \]
  6. Multiplying with the transition matrix on the right, we get:
    \[ \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.6 & 0.4 \\ 0.9 & 0.1 \end{bmatrix} = \begin{bmatrix} 0.6 & 0.4 \end{bmatrix}, \]
    so at time \( n+1 \) we have a 60% probability of being in state \(A\) and a 40% probability of being in state \(B\), as expected (see the sketch below).
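
As a check, the following minimal NumPy sketch reproduces the update from steps 5 and 6:

```python
import numpy as np

# Transition matrix from the example, rows and columns ordered (A, B).
T = np.array([
    [0.6, 0.4],  # from A: stay in A with 0.6, move to B with 0.4
    [0.9, 0.1],  # from B: move to A with 0.9, stay in B with 0.1
])

# Certainty of being in state A at time n, encoded as a row vector.
x_n = np.array([1.0, 0.0])

# Right-multiplication by T gives the distribution at time n + 1.
x_next = x_n @ T
print(x_next)  # [0.6 0.4]: 60% in A, 40% in B
```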
