Satinder Singh

Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical…
We propose a framework for including information-processing bounds in rational analyses. It is an application of bounded optimality (Russell & Subramanian, 1995) to the challenges of developing theories of mechanism and behavior. The framework is based on the idea that behaviors are generated by cognitive mechanisms that are adapted to the structure of not…
The network architectures of the proposed models and the baselines are illustrated in Figure 1. The LSTM weights are initialized from a uniform distribution over [−0.08, 0.08]. The weights of the fully-connected layers from the encoded feature to the factored layer and from the action to the factored layer are initialized from a uniform distribution over [−1, 1]…
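The initialization ranges quoted in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library; the layer sizes, the seed, and the names `uniform_init`, `lstm_weights`, `encoding_to_factor`, and `action_to_factor` are assumptions made for illustration and do not come from the paper.

```python
import random


def uniform_init(rows, cols, bound, rng):
    """Return a rows x cols matrix with entries drawn uniformly from [-bound, bound]."""
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


# Fixed seed so the sketch is reproducible (an assumption, not from the source).
rng = random.Random(0)

# Hypothetical layer sizes; the abstract does not state the real dimensions.
lstm_weights = uniform_init(4, 8, 0.08, rng)        # LSTM: range [-0.08, 0.08]
encoding_to_factor = uniform_init(8, 16, 1.0, rng)  # encoded feature -> factored layer: [-1, 1]
action_to_factor = uniform_init(3, 16, 1.0, rng)    # action -> factored layer: [-1, 1]
```

Only the two ranges are taken from the text; a real implementation would normally rely on the initializers of its deep learning framework.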
This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future…
Designers of artificial agents have goals and purposes that implicitly define a preference over possible agent behaviors. Nevertheless, it is rare that the designer knows the most preferred behavior in a form that allows it to be simply programmed into the agent. Instead, in building autonomous agents it is often more robust and useful for the designer to…
We consider the problem of implementing a system-optimal decision policy in the context of self-interested agents with private state in an uncertain world. Unique to our model is that we allow both persistent agents, with an agent having a local MDP model to describe how its local world evolves given actions by a center, and also periodically-inaccessible…
BACKGROUND: Interactive voice response (IVR) calls enhance health systems' ability to identify health risk factors, thereby enabling targeted clinical follow-up. However, redundant assessments may increase patient dropout and represent a lost opportunity to collect more clinically useful data. OBJECTIVE: We determined the extent to which previous IVR…