Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical …
The network architectures of the proposed models and the baselines are illustrated in Figure 1. The LSTM weights are initialized from a uniform distribution over [−0.08, 0.08]. The weights of the fully-connected layers from the encoded feature to the factored layer and from the action to the factored layer are initialized from a uniform distribution over [−1, 1] …
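As a rough illustration of the uniform initialization described above, the following sketch draws LSTM weights from [−0.08, 0.08] using only the Python standard library. The layer sizes, the helper name `uniform_init`, and the four-gate weight layout are assumptions made for the example and are not taken from the paper. The factored-layer initialization is left out because the sentence describing it is cut off in this listing.

```python
import random


def uniform_init(rows, cols, low, high, rng):
    """Return a rows x cols matrix (list of lists) with entries drawn uniformly from [low, high]."""
    return [[rng.uniform(low, high) for _ in range(cols)] for _ in range(rows)]


# Fixed seed so the sketch is repeatable.
rng = random.Random(0)

# Hypothetical sizes; the listing does not state the real ones.
HIDDEN_SIZE = 8
INPUT_SIZE = 16

# Assumption: a standard LSTM with four gates stacked into one weight matrix
# that acts on the concatenated input and previous hidden state.
lstm_weights = uniform_init(4 * HIDDEN_SIZE, INPUT_SIZE + HIDDEN_SIZE, -0.08, 0.08, rng)
```

A narrow range such as [−0.08, 0.08] keeps the initial gate pre-activations small, which is a common choice for recurrent layers.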
Designers of artificial agents have goals and purposes that implicitly define a preference over possible agent behaviors. Nevertheless, it is rare that the designer knows the most preferred behavior in a form that allows it to be simply programmed into the agent. Instead, in building autonomous agents it is often more robust and useful for the designer to …
Background: Mobile health (mHealth) services cannot easily adapt to users' unique needs. Purpose: We used simulations of text messaging (SMS) for improving medication adherence to demonstrate benefits of interventions using reinforcement learning (RL). Methods: We used Monte Carlo simulations to estimate the relative impact of an intervention using RL to …
We propose a framework for including information-processing bounds in rational analyses. It is an application of bounded optimality (Russell & Subramanian, 1995) to the challenges of developing theories of mechanism and behavior. The framework is based on the idea that behaviors are generated by cognitive mechanisms that are adapted to the structure of not …
Utility maximization is a key element of a number of theoretical approaches to explaining human behavior. Among these approaches are rational analysis, ideal observer theory, and signal detection theory. While some examples of these approaches define the utility maximization problem with little reference to the bounds imposed by the organism, others start …
We propose a framework for including information processing bounds in rational analyses. It is an application of bounded optimality (Russell & Subramanian, 1995) to the challenges of developing theories of mechanism and behavior. The framework is based on the idea that behaviors are generated by cognitive mechanisms that are adapted to the structure of not …
This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future …