In this paper, we propose a model that develops robots' covert and overt behaviors by using reinforcement learning and supervised learning jointly. Covert behaviors are handled by a motivational system, which is realized through reinforcement learning. Overt behaviors are selected directly by imposing supervised signals. Rather than addressing problems in controlled environments with a low-dimensional state space, our model is applied to learning in non-stationary environments. A locally balanced incremental hierarchical discriminant regression (LBIHDR) tree is introduced as the engine of cognitive mapping. Its balanced coarse-to-fine tree structure guarantees real-time retrieval in a self-generated high-dimensional state space. Furthermore, a K-nearest neighbor strategy is adopted to reduce training time complexity. Vision-based outdoor navigation is used as a challenging example task. In the experiment, the mean square error of heading direction is 0° for the re-substitution test and 1.1269° for the disjoint test, which allows the robot to drive without large deviation from the expected path. Compared with IHDR (W.S. Hwang and J. Weng, 2007), LBIHDR reduces the mean square error by 0.252° and 0.5052° in the re-substitution and disjoint tests, respectively.
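
To make the two evaluation protocols concrete, the following is a minimal sketch of how a mean square heading error might be computed under a re-substitution test (querying with the training samples themselves) and a disjoint test (querying with held-out samples). It is not the LBIHDR tree: a brute-force nearest-neighbor regressor stands in for the retrieval engine, and the synthetic data, the two-dimensional feature vectors, and the names `knn_predict`, `mean_square_error`, and `make_samples` are all assumptions introduced for illustration.

```python
import math
import random


def knn_predict(train, query, k=1):
    """Average the headings of the k training samples nearest to `query`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return sum(h for _, h in nearest) / len(nearest)


def mean_square_error(train, test, k=1):
    """Mean squared heading error over `test`, predicting from `train`."""
    errors = [(knn_predict(train, x, k) - h) ** 2 for x, h in test]
    return sum(errors) / len(errors)


def make_samples(n, rng):
    """Synthetic (feature vector, heading in degrees) pairs; hypothetical data."""
    samples = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        heading = 30.0 * x[0] + 5.0 * x[1]  # assumed toy mapping, not from the paper
        samples.append((x, heading))
    return samples


rng = random.Random(0)
train = make_samples(200, rng)
held_out = make_samples(50, rng)

# Re-substitution test: evaluate on the same samples used for training.
resub_mse = mean_square_error(train, train, k=1)
# Disjoint test: evaluate on samples never seen during training.
disjoint_mse = mean_square_error(train, held_out, k=1)

print(f"re-substitution MSE: {resub_mse:.4f}")
print(f"disjoint MSE:        {disjoint_mse:.4f}")
```

With a single nearest neighbor, every training sample retrieves itself, so the re-substitution error in this sketch is exactly zero, while the disjoint error reflects how well the stored samples generalize to unseen inputs.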