Using reinforcement learning to optimize occupant comfort and energy usage in HVAC systems