Markovian Decision Process

Tags: adaptive-intelligence

A decision process in which the probability of transitioning to the next state depends only on the current state and action, not on the whole previous history [see page 5]. Such decision problems satisfy the Markovian property.

Markovian Property

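Restates the description of a Markov decision process mathematically: conditioning on earlier states, actions, and rewards adds nothing beyond the current state and action.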

\begin{align} \label{eq:markov-prop} P({s}_{t+1}, {r}_{t+1} | {s}_{t}, {a}_{t}, {r}_{t}, {s}_{t-1}, {a}_{t-1}, {r}_{t-1}) = P({s}_{t+1}, {r}_{t+1} | {s}_{t}, {a}_{t}) \end{align}
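
The property can be illustrated with a minimal Python sketch. Everything here is an assumption made up for the example: the two states ("cold", "warm"), the two actions ("heat", "wait"), the probabilities, the rewards, and the helper names `TRANSITIONS`, `step`, and `rollout`. The point is only that `step` receives the current state and action and nothing else, so the sampled next state and reward cannot depend on earlier history.

```python
import random

# Hypothetical two-state MDP, invented for illustration only.
# TRANSITIONS[(state, action)] -> list of (probability, next_state, reward)
TRANSITIONS = {
    ("cold", "heat"): [(0.9, "warm", 1.0), (0.1, "cold", 0.0)],
    ("cold", "wait"): [(1.0, "cold", 0.0)],
    ("warm", "heat"): [(1.0, "warm", -1.0)],
    ("warm", "wait"): [(0.7, "warm", 1.0), (0.3, "cold", 0.0)],
}


def step(state, action, rng):
    """Sample (next_state, reward) from P(s', r | s, a).

    Only the current state and action are passed in, which is
    exactly what the Markovian property allows.
    """
    outcomes = TRANSITIONS[(state, action)]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def rollout(start_state, actions, seed=0):
    """Apply a fixed action sequence and record s0, a0, r1, s1, a1, r2, s2, ..."""
    rng = random.Random(seed)
    state = start_state
    history = [state]
    for action in actions:
        state, reward = step(state, action, rng)
        history += [action, reward, state]
    return history


if __name__ == "__main__":
    print(rollout("cold", ["heat", "wait", "wait"]))
```

The recorded history is kept only for inspection; `step` never reads it. A process that violated the property would need that history as an extra argument to `step`.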

Links to this note

SARSA Algorithm