Definition
Reward is a scalar feedback indicating how well the agent is doing at step .
There are three reward types:
- : the reward in the state .
- : the reward by the action in the state .
- : the reward received after transitioning from state to due to action .
Expected Reward
Under known dynamics of all transitions , one can compute the followings:
-
State transition probability
-
Expected reward for (state, action) pair
-
Expected reward for (state, action, next state) triple
Facts
Reinforcement learning is based on the Reward Hypothesis.