
- Alexander L. Strehl, Michael L. Littman
- J. Comput. Syst. Sci.
- 2008

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation…

For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm---Delayed Q-Learning. We prove it is PAC, achieving near optimal performance…
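The delayed-update idea can be sketched as follows. This is a loose, simplified illustration, not the paper's exact algorithm: the `TwoStateChain` toy MDP, the permanent `learn`-flag handling, and all parameter values are assumptions made for the sketch.

```python
from collections import defaultdict

class TwoStateChain:
    """Toy deterministic MDP (illustrative): action 1 moves to state 1,
    which pays reward 1; action 0 returns to state 0."""
    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):
        self.s = 1 if a == 1 else 0
        r = 1.0 if self.s == 1 else 0.0
        return self.s, r, False

def delayed_q_learning(env, num_actions, m=5, eps1=0.1, gamma=0.95,
                       steps=2000):
    """Simplified sketch of Delayed Q-Learning: Q(s, a) changes only
    after m samples accumulate for that pair, and only when the batched
    update would lower Q(s, a) by at least 2 * eps1."""
    vmax = 1.0 / (1.0 - gamma)              # optimistic initial value
    Q = defaultdict(lambda: vmax)           # Q[(s, a)]
    U = defaultdict(float)                  # accumulated update targets
    count = defaultdict(int)                # samples since last attempt
    learn = defaultdict(lambda: True)       # simplified LEARN flags

    s = env.reset()
    for _ in range(steps):
        a = max(range(num_actions), key=lambda x: Q[(s, x)])  # greedy
        s2, r, done = env.step(a)
        if learn[(s, a)]:
            U[(s, a)] += r + gamma * max(Q[(s2, x)]
                                         for x in range(num_actions))
            count[(s, a)] += 1
            if count[(s, a)] == m:          # attempt a batched update
                target = U[(s, a)] / m + eps1
                if Q[(s, a)] - target >= 2 * eps1:
                    Q[(s, a)] = target      # successful update
                else:
                    # the real algorithm re-enables learning after other
                    # updates occur; here we simply stop, for brevity
                    learn[(s, a)] = False
                U[(s, a)] = 0.0
                count[(s, a)] = 0
        s = env.reset() if done else s2
    return Q

# Action 0 loses its optimism after enough samples; action 1 keeps a
# higher value, so the greedy policy heads to the rewarding state.
Q = delayed_q_learning(TwoStateChain(), num_actions=2)
```

Because updates happen only in batches of m and must clear the 2·eps1 threshold, the number of successful updates per pair is bounded, which is what drives the PAC analysis.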

- Alexander L. Strehl, Lihong Li, Michael L. Littman
- Journal of Machine Learning Research
- 2009

We study the problem of learning near-optimal behavior in finite Markov Decision Processes (MDPs) with a polynomial number of samples. These “PAC-MDP” algorithms include the well-known E3 and R-MAX…

- Alexander L. Strehl, Michael L. Littman
- ICML
- 2005

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation…
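The interval-estimation idea — acting on count-driven optimistic upper bounds — can be illustrated with a small value-iteration sketch in the spirit of the exploration-bonus (MBIE-EB style) variant. The data structures, the value of `beta`, and the toy single-state example are assumptions for illustration, not the paper's exact construction.

```python
import math

def mbie_eb_q(counts, rewards, transitions, gamma=0.95, beta=0.5,
              iters=200):
    """Value iteration with an exploration bonus beta / sqrt(n(s, a)),
    in the spirit of model-based interval estimation (MBIE-EB style).
    counts[s][a]: visit count; rewards[s][a]: empirical mean reward;
    transitions[s][a]: {next_state: empirical probability}."""
    states = list(counts)
    Q = {s: {a: 0.0 for a in counts[s]} for s in states}
    for _ in range(iters):
        V = {s: max(Q[s].values()) for s in states}
        for s in states:
            for a in counts[s]:
                bonus = beta / math.sqrt(max(counts[s][a], 1))
                Q[s][a] = (rewards[s][a] + bonus
                           + gamma * sum(p * V[s2] for s2, p
                                         in transitions[s][a].items()))
    return Q

# One state, two zero-reward self-loop actions: the rarely tried action
# gets the larger bonus, so optimism steers exploration toward it.
counts = {0: {"tried": 100, "new": 1}}
rewards = {0: {"tried": 0.0, "new": 0.0}}
transitions = {0: {"tried": {0: 1.0}, "new": {0: 1.0}}}
Q = mbie_eb_q(counts, rewards, transitions)
```

The bonus shrinks as 1/√n, so an action's optimism fades exactly as the agent gathers evidence about it.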

- Alexander L. Strehl, Carlos Diuk, Michael L. Littman
- AAAI
- 2007

We consider the problem of reinforcement learning in factored-state MDPs in the setting in which learning is conducted in one long trial with no resets allowed. We show how to extend existing…

We consider the problem of estimating the conditional probability of a label in time O(log n), where n is the number of possible labels. We analyze a natural reduction of this problem to a set of…
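The O(log n) reduction described above arranges the n labels at the leaves of a binary tree and multiplies conditional branch probabilities along the root-to-leaf path. A minimal sketch, with fixed constants standing in for the learned per-node regressors (the heap-style node indexing is an illustrative choice):

```python
def leaf_probability(label, n, node_prob):
    """P(label | x) as a product of O(log n) per-node branch
    probabilities on a binary tree whose leaves are the n labels.
    node_prob maps a heap-style node id to P(go right | x) at that
    node; fixed constants stand in for learned regressors here."""
    lo, hi = 0, n      # current leaf range [lo, hi)
    node = 1           # root
    p = 1.0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        p_right = node_prob[node]
        if label >= mid:                   # descend right
            p *= p_right
            lo, node = mid, 2 * node + 1
        else:                              # descend left
            p *= 1.0 - p_right
            hi, node = mid, 2 * node
    return p

# Uniform branch probabilities over n = 4 labels give P = 1/4 each.
node_prob = {1: 0.5, 2: 0.5, 3: 0.5}
probs = [leaf_probability(k, 4, node_prob) for k in range(4)]
```

Only one root-to-leaf path is evaluated per query, which is where the O(log n) cost comes from.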

- Alexander L. Strehl, John Langford, Sham M. Kakade
- NIPS
- 2010

We provide a sound and consistent foundation for the use of non-random exploration data in “contextual bandit” or “partially labeled” settings where only the value of a chosen action is learned. The…
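The core device for evaluating a new policy from such partially labeled logs is importance weighting: each logged reward is reweighted by the inverse probability that the logging policy chose that action. A basic inverse-propensity sketch follows; note the paper's contribution concerns non-random and unknown logging policies, which this simple known-propensity estimator does not address, and the toy data are assumptions.

```python
def ips_value(logged, policy):
    """Inverse-propensity estimate of a new policy's value from logged
    (context, action, reward, propensity) tuples, where only the
    logged action's reward was ever observed."""
    total = 0.0
    for x, a, r, p in logged:
        if policy(x) == a:     # new policy would have taken this action
            total += r / p     # reweight by inverse logging probability
    return total / len(logged)

def new_policy(x):
    """Hypothetical policy to evaluate offline."""
    return "a" if x == 0 else "b"

# Toy log from a uniform (propensity 0.5) logging policy over {a, b}.
logged = [(0, "a", 1.0, 0.5), (0, "b", 0.0, 0.5),
          (1, "a", 0.0, 0.5), (1, "b", 1.0, 0.5)]
value = ips_value(logged, new_policy)
```

With correct propensities this estimator is unbiased: actions the new policy would not have taken contribute nothing, and matching actions are up-weighted to compensate.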

- Alexander L. Strehl, Michael L. Littman
- NIPS
- 2007

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and…
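A model-based approach to continuous-state MDPs needs an online regressor for the dynamics. As a loose illustration only — the 1-D dynamics, the ordinary-least-squares fit, and the `None`-when-uncertain convention are all simplifying assumptions, not the paper's algorithm — consider:

```python
class OnlineLinearModel:
    """Incremental least-squares fit of 1-D linear dynamics
    s_next = w * s + b, the kind of model a model-based agent can
    plan against in a continuous-state MDP. Pure-Python sketch."""

    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, s, s_next):
        """Fold one observed transition into the running sums."""
        self.n += 1
        self.sx += s
        self.sy += s_next
        self.sxx += s * s
        self.sxy += s * s_next

    def predict(self, s):
        """Predict the next state, or None when the fit is not yet
        determined (loosely echoing a 'don't know' answer)."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or abs(denom) < 1e-12:
            return None
        w = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - w * self.sx) / self.n
        return w * s + b

# Fit transitions generated by s_next = 2 * s + 1, then extrapolate.
model = OnlineLinearModel()
for s in [0.0, 1.0, 2.0, 3.0]:
    model.update(s, 2.0 * s + 1.0)
pred = model.predict(4.0)
```

Declining to predict until the data pin down the fit is the hook a planner needs to treat unmodeled regions optimistically.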

- Alexander L. Strehl, Michael L. Littman
- 16th IEEE International Conference on Tools with…
- 2004

This work takes an empirical approach to evaluating three model-based reinforcement-learning methods. All methods intend to speed the learning process by mixing exploitation of learned knowledge with…

- Qinfeng Shi, James Petterson, +4 authors S. V. N. Vishwanathan
- AISTATS
- 2009

We propose hashing to facilitate efficient kernels. This generalizes previous work using sampling and we show a principled way to compute the kernel matrix for data streams and sparse feature spaces.…
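Feature hashing of this kind maps each (possibly string-valued) feature into one of a fixed number of buckets, with a sign bit to reduce collision bias, and computes the kernel as a dot product of the hashed vectors. A small sketch — the use of MD5 and this particular signing scheme are illustrative choices, not the paper's exact construction:

```python
import hashlib

def hashed_vector(features, num_buckets):
    """Hash a sparse {feature: value} dict into a dense vector.
    A signed hash reduces the bias introduced by collisions."""
    v = [0.0] * num_buckets
    for f, x in features.items():
        h = int(hashlib.md5(f.encode()).hexdigest(), 16)
        idx = h % num_buckets                        # bucket index
        sign = 1.0 if (h // num_buckets) % 2 == 0 else -1.0
        v[idx] += sign * x
    return v

def hash_kernel(f1, f2, num_buckets=64):
    """Approximate inner product of two sparse feature maps
    via their fixed-size hashed representations."""
    a = hashed_vector(f1, num_buckets)
    b = hashed_vector(f2, num_buckets)
    return sum(x * y for x, y in zip(a, b))

# With enough buckets, the self-kernel approximates the squared norm
# of the original sparse feature vector.
f = {"cat": 1.0, "dog": 2.0}
k_self = hash_kernel(f, f, num_buckets=1024)
```

Because the hashed representation has fixed size regardless of the feature space, the same code handles data streams and very sparse, high-dimensional inputs.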