Using Control Theory for Analysis of Reinforcement Learning and Optimal Policy Properties in Grid-World Problems