Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have ...
Keepaway soccer has been previously put forth as a testbed for machine learning. Although multiple researchers have used it successfully for machine learning experiments, doing so has required a good deal of domain expertise. This paper introduces a set of programs, tools, and resources designed to make the domain easily usable for experimentation without ...
Ant robots are simple creatures with limited sensing and computational capabilities. They have the advantage that they are easy to program and cheap to build. This makes it feasible to deploy groups of ant robots and take advantage of the resulting fault tolerance and parallelism. We study, both theoretically and in simulation, the behavior of ant robots ...
Transfer learning concerns applying knowledge learned in one task (the source) to improve learning another related task (the target). In this paper, we use structure mapping, a psychological and computational theory about analogy making, to find mappings between the source and target tasks and thus construct the transfer functional automatically. Our ...
Sodium/hydrogen transporters transfer ions across membranes and thus play an important role in pH and electrolyte homeostasis. To further understand the mechanism and function of the plant vacuolar Na+/H+, a new Na+/H+ antiporter, named DmNHX1, was isolated from chrysanthemum (Dendranthema morifolium) and characterized. The total length of DmNHX1 is 1,897 ...
In this paper, we study a simple means for coordinating teams of simple agents. In particular, we study ant robots and how they can cover terrain once or repeatedly by leaving markings in the terrain, similar to what ants do. These markings can be sensed by all robots and allow them to cover terrain even if they do not communicate with each other except via ...
We present half field offense, a novel subtask of RoboCup simulated soccer, and pose it as a problem for reinforcement learning. In this task, an offense team attempts to outplay a defense team in order to shoot goals. Half field offense extends keepaway [11], a simpler subtask of RoboCup soccer in which one team must try to keep possession of the ball ...
We study how to find plans that maximize the expected total utility for a given MDP, a planning objective that is important for decision making in high-stakes domains. The optimal actions can now depend on the total reward that has been accumulated so far in addition to the current state. We extend our previous work on functional value iteration from ...
Reinforcement learning is a paradigm under which an agent seeks to improve its policy by making learning updates based on the experiences it gathers through interaction with the environment. Model-free algorithms perform updates solely based on observed experiences. By contrast, model-based algorithms learn a model of the environment that effectively ...