Learn More
We present new algorithms for reinforcement learning and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal return is lower bounded by the mixing time T of the optimal policy (in the undiscounted(More)
A critical issue for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or infinite state spaces, traditional planning and reinforcement learning algorithms may be inapplicable, since their running time typically grows(More)
While most theoretical work in machine learning has focused on the complexity of learning, recently there has been increasing interest in formally studying the complexity o f teaching. In this paper we study the complexity of teaching by considering a variant of the on-line learning model in which a helpful teacher selects the instances. We measure the(More)
In this paper we initiate an investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions. The ultimate goal in this direction is informally termed <italic>agnostic learning</italic>, in which we make virtually no assumptions on the target function. The name(More)
In this paper, we prove the intractability of learning several classes of Boolean functions in the distribution-free model (also called the Probably Approximately Correct or PAC model) of learning from examples. These results are <italic>representation independent</italic>, in that they hold regardless of the syntactic form in which the learner chooses to(More)
Multi-agent games are becoming an increasingly prevalent formalism for the study of electronic commerce and auctions. The speed at which transactions can take place and the growing complexity of electronic marketplaces makes the study of computationally simple agents an appealing direction. In this work, we analyze the behavior of agents that incrementally(More)
We present new results on the well-studied problem of learning DNF expressions. We prove that an algorithm due to Kushilevitz and Mansour [13] can be used to weakly learn DNF formulas with membership queries with respect to the uniform distribution. This is the first positive result known for learning general DNF in polynomial time in a nontrivial model.(More)