We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal return is lower bounded by the mixing time T of the optimal policy (in the undiscounted …
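To make the role of the mixing time concrete: a policy's T-step average return only approaches its asymptotic return once the horizon T is comparable to the mixing time of the chain the policy induces, which is why any learner needs at least that many actions before it can even exhibit near-optimal return. The small simulation below is our own illustration of this point; the two-state chain, its rewards, and the horizons are hypothetical and are not taken from the paper.

# Illustrative only: the T-step average return of a fixed policy approaches its
# asymptotic (undiscounted) return as the horizon T grows past the mixing time.
# The two-state chain and rewards are hypothetical, not from the paper.
import random

P = {0: [(0.9, 0), (0.1, 1)],   # from state 0: stay w.p. 0.9, move w.p. 0.1
     1: [(0.5, 0), (0.5, 1)]}   # from state 1: return w.p. 0.5, stay w.p. 0.5
R = {0: 0.0, 1: 1.0}            # reward collected in each state

def avg_return(T, start=0, trials=2000):
    """Monte Carlo estimate of the T-step average return from `start`."""
    total = 0.0
    for _ in range(trials):
        s, ret = start, 0.0
        for _ in range(T):
            ret += R[s]
            r, acc = random.random(), 0.0
            for p, nxt in P[s]:
                acc += p
                if r <= acc:
                    s = nxt
                    break
        total += ret / T
    return total / trials

for T in (1, 5, 20, 100):
    print(T, round(avg_return(T), 3))   # converges toward the stationary value, here about 1/6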
An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or even infinite state spaces, traditional planning and reinforcement learning algorithms are often inapplicable, since their running time …
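One standard route around dependence on the number of states, and the rough idea behind sparse-sampling planners for large MDPs, is to plan by lookahead from the current state using only a generative model, sampling a fixed number of next states per action. The sketch below is a minimal, hypothetical version of that idea, not the paper's exact algorithm; generative_model, the action set, and the constants are assumptions made for illustration.

# Minimal sparse-sampling lookahead sketch (illustrative only). Assumes a
# hypothetical generative_model(s, a) -> (next_state, reward) that can be
# sampled for any state, so the cost of choosing an action depends on the
# sampling width C and depth H rather than on the number of states.
import random

ACTIONS = [0, 1]
GAMMA = 0.9

def generative_model(s, a):
    """Hypothetical simulator: returns (next_state, reward)."""
    nxt = (s + a + random.choice([0, 1])) % 10
    return nxt, float(nxt == 0)

def q_estimate(s, a, depth, width=4):
    """Estimate Q(s, a) by sampling `width` next states and recursing to `depth`."""
    if depth == 0:
        return 0.0
    total = 0.0
    for _ in range(width):
        nxt, r = generative_model(s, a)
        total += r + GAMMA * max(q_estimate(nxt, b, depth - 1, width)
                                 for b in ACTIONS)
    return total / width

def plan(s, depth=3):
    """Pick the action with the highest sampled lookahead value from state s."""
    return max(ACTIONS, key=lambda a: q_estimate(s, a, depth))

print(plan(7))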
In this paper, we prove the intractability of learning several classes of Boolean functions in the distribution-free model (also called the Probably Approximately Correct or PAC model) of learning from examples. These results are representation independent, in that they hold regardless of the syntactic form in which the learner chooses to …
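For reference, the distribution-free (PAC) criterion these hardness results are stated against can be written in one line; this is the standard formulation of the model, not a quotation from the paper. Given accuracy and confidence parameters \varepsilon and \delta, a target concept c, and examples drawn i.i.d. from an arbitrary, unknown distribution D, the learner must, with probability at least 1 - \delta, output a hypothesis h (in any representation, hence "representation independent") satisfying

\[ \Pr_{x \sim D}\bigl[\, h(x) \neq c(x) \,\bigr] \le \varepsilon, \]

using time and a number of examples polynomial in 1/\varepsilon, 1/\delta, and the relevant size parameters of the concept class.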
Multi-agent games are becoming an increasingly prevalent formalism for the study of electronic commerce and auctions. The speed at which transactions can take place and the growing complexity of electronic marketplaces make the study of computationally simple agents an appealing direction. In this work, we analyze the behavior of agents that incrementally …
In this paper we investigate a new formal model of machine learning in which the concept (boolean function) to be learned may exhibit uncertain or probabilistic behavior; thus, the same input may sometimes be classified as a positive example and sometimes as a negative example. Such probabilistic concepts (or p-concepts) may arise in situations such as …
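As a concrete illustration of a probabilistic concept (our own toy example, not one from the paper): rather than carrying a deterministic label, each input x has a probability c(x) of being labeled positive, and labeled examples are generated by flipping a coin with that bias, so repeated draws of the same x can disagree.

# Illustrative p-concept (hypothetical): each input x in [0, 1] is labeled
# positive with probability c(x), so the same x can appear as both a positive
# and a negative example across draws.
import random

def c(x):
    """The p-concept: probability that input x is labeled positive."""
    return x   # hypothetical choice for illustration: P(label = 1 | x) = x

def draw_example():
    x = random.random()                          # x drawn uniformly from [0, 1]
    label = 1 if random.random() < c(x) else 0   # coin flip with bias c(x)
    return x, label

sample = [draw_example() for _ in range(5)]
print(sample)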