Yohei Takamori

  • Citations Per Year
Learn More
In this paper, we propose a Profit Sharing that can learn deterministic policy for POMDPs environments. The proposed method can obtain the deterministic policy by using the history of observations. In the proposed method, the states in the perceptual aliasing are detected. Here, the perceptual aliasing means that different states can be perceived as the(More)
  • 1