Definition

A learning policy is called GLIE (Greedy in the Limit with Infinite Exploration) if it satisfies both of the following conditions:

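  • All state-action pairs are explored infinitely many times: $\lim_{k \to \infty} N_k(s, a) = \infty$, where $N_k(s, a)$ is the incremental count of visits to the state-action pair $(s, a)$.
  • The learning policy converges to a greedy policy: $\lim_{k \to \infty} \pi_k(a \mid s) = \mathbf{1}\left(a = \arg\max_{a'} Q_k(s, a')\right)$, where $Q_k$ is the action-value estimate at iteration $k$.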
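
As an illustration of the two conditions, a commonly cited example is an epsilon-greedy policy whose exploration rate decays as 1/k. The sketch below is a minimal version under that assumption; the names glie_epsilon, epsilon_greedy_probs, and sample_action are hypothetical helpers introduced here, and the 1/k schedule is only one possible choice. It also assumes every state keeps being visited, which the policy alone cannot guarantee.

```python
import random


def glie_epsilon(k: int) -> float:
    """Exploration rate at iteration k (k >= 1).

    Assumed schedule: epsilon_k = 1/k. It tends to 0 (greedy in the limit)
    while staying strictly positive for every finite k.
    """
    return 1.0 / k


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy for a single state."""
    n = len(q_values)
    # Index of the greedy action (first maximum on ties).
    greedy = max(range(n), key=lambda a: q_values[a])
    # Every action receives epsilon / n, so each one keeps a nonzero chance.
    probs = [epsilon / n] * n
    # The greedy action receives the remaining 1 - epsilon mass on top.
    probs[greedy] += 1.0 - epsilon
    return probs


def sample_action(q_values, k, rng=random):
    """Draw one action for iteration k from the decaying epsilon-greedy policy."""
    probs = epsilon_greedy_probs(q_values, glie_epsilon(k))
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

At k = 1 the policy is uniform over actions; as k grows, the probability of the greedy action approaches 1 while every other action keeps a small positive probability, which mirrors the two bullet points above.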