- Peter L. Bartlett, Jonathan Baxter
- J. Artif. Intell. Res.
- 2001

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems…

- Jonathan Baxter
- J. Artif. Intell. Res.
- 2000

A major problem in machine learning is that of inductive bias: how to choose a learner’s hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small…

Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the…

- Peter L. Bartlett, Jonathan Baxter, Lex Weaver
- J. Artif. Intell. Res.
- 2001

In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm…

- Evan Greensmith, Peter L. Bartlett, Jonathan Baxter
- Journal of Machine Learning Research
- 2001

We consider the use of two additive control variate methods to reduce the variance of performance gradient estimates in reinforcement learning problems. The first approach we consider is the baseline…

- Jonathan Baxter
- Machine Learning
- 1997

A Bayesian model of learning to learn by sampling from multiple tasks is presented. The multiple tasks are themselves generated by sampling from a distribution over an environment of related tasks.…

- Jonathan Baxter, Peter L. Bartlett
- ICML
- 2000

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a REINFORCE-like…

Despite their many empirical successes, approximate value-function based approaches to reinforcement learning suffer from a paucity of theoretical guarantees on the performance of the policy…

- Llew Mason, Peter L. Bartlett, Jonathan Baxter
- Machine Learning
- 2000

Recent theoretical results have shown that the generalization performance of thresholded convex combinations of base classifiers is greatly improved if the underlying convex combination has large…