In this paper we propose a new formal model for studying reinforcement learning, based on Valiant's PAC framework.
In our model the learner does not have direct access to every state of the environment. Instead, every sequence of experiments starts in a fixed initial state and the learner is provided with a “reset” operation that interrupts the…

In many optimization and decision problems the objective function can be expressed as a linear combination of competing criteria, the weights of which specify the relative importance of the criteria for the user. We consider the problem of learning such a "subjective" function from preference judgments collected from traces of user interactions. We…

We propose a model of efficient on-line reinforcement learning based on the expected mistake bound framework introduced by Haussler, Littlestone and Warmuth (1987). The measure of performance we use is the expected difference between the total reward received by the learning agent and that received by an agent behaving optimally from the start. We call this…

Current route advice systems present a single route to the driver based on static evaluation criteria, with little or no recourse if the driver finds this solution unsatisfactory. In this paper, we propose a more flexible approach and its implementation in the Adaptive Route Advisor. Our system behaves more like a human travel agent, using driver…

Generating satisfactory routes for driving is a challenging task because the desirability of a particular route depends on many factors and varies from driver to driver. Current route advice systems present a single route to the driver based on static evaluation criteria, with little or no recourse if the driver finds this solution unsatisfactory. In this…

- Seth Rogers, Claude-Nicolas Fiechter, Cynthia Thompson
- 2000

Our research group is investigating the use of adaptive user interfaces for in-car information access. These interfaces attempt to efficiently provide content the driver needs and wants, and gather feedback on these preferences through the driver's interaction with the system. In this way, the performance of the system improves as it unobtrusively builds a…

We consider a special case of reinforcement learning where the environment can be described by a linear system. The states of the environment and the actions the agent can perform are represented by real vectors and the system dynamic is given by a linear equation with a stochastic component. The problem is equivalent to the so-called linear quadratic…
