This paper introduces Learn Structure and Exploit RMax (LSE-RMax), a novel model-based structure learning algorithm for ergodic factored-state MDPs. Given a planning horizon that satisfies a condition, LSE-RMax provably guarantees a return very close to the optimal return, with high certainty, without requiring any prior knowledge of the in-degree of the …
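For context on the "in-degree" mentioned here: in a factored-state MDP the transition model is typically a dynamic Bayesian network, and the in-degree is the maximum number of parent variables that any state variable's next value depends on. The sketch below is purely illustrative, not code from the paper; the variables, parent sets, and probabilities are hypothetical, and it only shows what a bounded-in-degree factored transition model looks like.

```python
# Illustrative sketch only (not LSE-RMax itself): a factored-state MDP whose
# transition model is a dynamic Bayesian network. The "in-degree" is the
# maximum number of parent variables any next-state variable depends on;
# the abstract's point is that this bound need not be known in advance.
import random

# Hypothetical structure: three binary state variables under one action "a".
# Variable 2 depends on variables {0, 1}, so the in-degree here is 2.
PARENTS = {
    "a": {0: (0,), 1: (1,), 2: (0, 1)},
}

def sample_next_state(state, action):
    """Sample each next-state variable from a distribution conditioned
    only on that variable's parents in the current state."""
    next_state = []
    for var in sorted(PARENTS[action]):
        parent_vals = tuple(state[p] for p in PARENTS[action][var])
        p_one = 0.8 if all(parent_vals) else 0.2  # toy conditional table
        next_state.append(1 if random.random() < p_one else 0)
    return tuple(next_state)

print(sample_next_state((1, 0, 1), "a"))  # e.g. (1, 0, 0)
```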
In the Trading Agent Competition Ad Auctions Game, agents compete to sell products by bidding to have their ads shown in a search engine's sponsored search results. We report on the winning agent from the first (2009) competition, TacTex. TacTex operates by estimating the full game state from limited information, using these estimates to make predictions, …
The traditional agenda in Multiagent Learning (MAL) has been to develop learners that guarantee convergence to an equilibrium in self-play or that converge to playing the best response against an opponent using one of a fixed set of known targeted strategies. This paper introduces an algorithm called Learn or Exploit for Adversary Induced Markov Decision …
Knowledge transfer between expert and novice agents is a challenging problem given that the knowledge representation and learning algorithms used by the novice learner can be fundamentally different from and inaccessible to the expert trainer. We are particularly interested in team tasks, robotic or otherwise, where new teammates need to replace currently …
… version that appeared in the official ICML proceedings. The only substantive change is due to the fact that, based on subsequent discussions with peers, we identified a technical flaw in the ways that our MLeS and CMLeS algorithms were guaranteeing safety. Specifically, it was possible that MLeS (and also CMLeS) may converge to modeling an arbitrary …
In recent years, great strides have been made towards creating autonomous agents that can learn via interaction with their environment. When considering just an individual agent, it is often appropriate to model the world as being stationary, meaning that the same action from the same state will always yield the same (possibly stochastic) effects. However, …
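To make the stationarity assumption concrete (the notation below is ours, not the abstract's): an environment is stationary for a single agent when the transition probabilities do not depend on the time step,

$$P(s_{t+1} = s' \mid s_t = s,\; a_t = a) \;=\; T(s, a, s') \qquad \text{for all } t.$$

When other agents in the environment are themselves adapting, the effective transition probabilities experienced by any one agent change over time, so this identity no longer holds.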
Communication is a key tool for facilitating multiagent coordination in cooperative and uncertain domains. We focus on a class of multiagent problems modeled as Decentralized Markov Decision Processes with Communication (DEC-MDP-COM) with local observability. The planning problem for computing the optimal communication strategy in such domains is often …
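As a rough sketch of what "local observability" means here (loosely following the DEC-MDP-COM formulation of Goldman and Zilberstein; the notation is ours): the joint state factors across agents and each agent observes its own component exactly,

$$s = (s_1, \dots, s_n), \qquad o_i = s_i \ \text{for each agent } i,$$

but no agent sees the other agents' components unless they are communicated, and each message incurs a cost that the optimal communication strategy must weigh against the value of the shared information.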
We believe that intelligent information agents will represent their users' interests in electronic marketplaces and other forums to trade, exchange, share, identify, and locate goods and services. Such information worlds will present unforeseen opportunities as well as challenges that can be best addressed by robust, self-sustaining agent communities. An …