We consider a partially observable Markov decision process (POMDP) model for improving a taxi agent's cruising decisions in a congested city. Using real-world data provided by a large taxi company in Singapore as a guide, we derive the state transition function of the POMDP. Specifically, we model the cruising behavior of the drivers as continuous-time Markov chains. We then apply a dynamic programming algorithm to find the optimal policy of the driver agent. Using a simulation, we show that this policy performs significantly better than a greedy policy in a congested road network.
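
To illustrate the contrast between a dynamic programming policy and a greedy policy, the following is a minimal sketch on a fully observable, discrete-time simplification of the cruising problem. It is not the POMDP or the continuous-time model described above. The three zones, the cost matrix, the pickup probabilities, the fares, the uniform drop-off assumption, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical 3-zone cruising example. All numbers are illustrative
# assumptions, not values derived from the Singapore taxi data.
GAMMA = 0.9                       # discount factor (assumed)
COST = [[0.0, 1.0, 2.0],          # COST[i][j]: cost of cruising from zone i to zone j
        [1.0, 0.0, 1.0],
        [2.0, 1.0, 0.0]]
PICKUP_PROB = [0.2, 0.5, 0.8]     # chance of finding a passenger in each zone
FARE = [10.0, 8.0, 6.0]           # expected fare of a trip starting in each zone
N = len(FARE)


def q_value(v, i, j):
    """Expected discounted return of cruising from zone i to zone j, then following v."""
    mean_v = sum(v) / N  # assumption: a trip drops the taxi in a uniformly random zone
    found = PICKUP_PROB[j] * (FARE[j] + GAMMA * mean_v)
    missed = (1.0 - PICKUP_PROB[j]) * GAMMA * v[j]  # no passenger: taxi stays in zone j
    return -COST[i][j] + found + missed


def value_iteration(sweeps=500):
    """Dynamic programming: repeatedly apply the Bellman optimality backup."""
    v = [0.0] * N
    for _ in range(sweeps):
        v = [max(q_value(v, i, j) for j in range(N)) for i in range(N)]
    policy = [max(range(N), key=lambda j: q_value(v, i, j)) for i in range(N)]
    return v, policy


def greedy_policy():
    """Myopic baseline: maximize the immediate expected fare minus the cruising cost."""
    return [max(range(N), key=lambda j: PICKUP_PROB[j] * FARE[j] - COST[i][j])
            for i in range(N)]


def evaluate(policy, sweeps=500):
    """Iterative policy evaluation for a fixed zone-to-zone policy."""
    v = [0.0] * N
    for _ in range(sweeps):
        v = [q_value(v, i, policy[i]) for i in range(N)]
    return v


if __name__ == "__main__":
    v_opt, pi_opt = value_iteration()
    pi_greedy = greedy_policy()
    print("DP policy:    ", pi_opt, v_opt)
    print("greedy policy:", pi_greedy, evaluate(pi_greedy))
```

In this simplified setting the dynamic programming values are, by construction, at least as large as the greedy policy's values in every zone. How large the gap is depends entirely on the assumed numbers, so the sketch says nothing about the magnitude of the improvement reported in the simulation.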