Tatsuji Takahashi

Some of the authors have previously proposed a cognitively inspired reinforcement learning architecture (LS-Q) that mimics cognitive biases in humans. LS-Q adaptively learns under uniform, coarse-grained state division and performs well without parameter tuning in a giant-swing robot task. However, these results were shown only in simulations. In this…
When an agent learns to collect reward in an unknown environment, its decision-making faces a speed-accuracy trade-off. The agent loses out if it keeps acting greedily, but it cannot maximize reward if it keeps exploring. Experience suggests that human beings act according to some kind of standard to cope with this trade-off. Hence, we focused…