
- Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra
- Artif. Intell.
- 1998

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable mdps (pomdps). We then outline a novel algorithm for solving pomdps offline and show how, in some cases, a…
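As a point of reference for the MDP theory the abstract introduces, a fully observable MDP can be solved exactly by value iteration. The sketch below uses an invented two-state toy chain, not an example from the paper.

```python
# Minimal value-iteration sketch for a fully observable MDP.
# The 2-state chain defined below is illustrative only.
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """P[s][a] is a list of (next_state, prob) pairs; R[s][a] is the reward."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                   for a in actions)
            for s in states
        }
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new

# Toy chain: "go" toggles between the two states; "stay" remains put.
states = ["A", "B"]
actions = ["stay", "go"]
P = {"A": {"stay": [("A", 1.0)], "go": [("B", 1.0)]},
     "B": {"stay": [("B", 1.0)], "go": [("A", 1.0)]}}
R = {"A": {"stay": 0.0, "go": 1.0},
     "B": {"stay": 1.0, "go": 0.0}}
V = value_iteration(states, actions, P, R)
```

A pomdp generalizes this by replacing the known state `s` with a belief distribution over states, which is what makes the offline algorithms the paper studies substantially harder.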

- Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore
- J. Artif. Intell. Res.
- 1996

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error…
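The trial-and-error learning the survey describes is most commonly illustrated with tabular Q-learning, one of the core algorithms it covers. The environment and parameter settings below are an invented toy, not taken from the survey.

```python
import random

# Tabular Q-learning sketch on a toy deterministic 3-state chain.
def q_learning(step, n_states, n_actions, episodes=500, alpha=0.5,
               gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(20):
            # Epsilon-greedy action selection: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda b: Q[s][b])
            s2, r = step(s, a)
            # Temporal-difference update toward the observed reward plus
            # the discounted value of the best next action.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

# Toy environment: action 1 moves right along the chain; reaching the
# end yields reward 1 and resets the agent to state 0.
def step(s, a):
    if a == 1:
        if s + 1 == 2:
            return 0, 1.0
        return s + 1, 0.0
    return s, 0.0

Q = q_learning(step, n_states=3, n_actions=2)
```

After training, the greedy policy in both reachable states prefers action 1, the action that advances toward the reward.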

- Michael L. Littman
- ICML
- 1994

In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior. The framework of Markov games allows us to widen this…

- Peter D. Turney, Michael L. Littman
- ACM Trans. Inf. Syst.
- 2003

The evaluative character of a word is called its *semantic orientation*. Positive semantic orientation indicates praise (e.g., "honest", "intrepid") and negative semantic orientation indicates criticism (e.g., "disturbing", "superfluous"). Semantic orientation varies in both direction (positive or negative) and degree (mild to strong). An automated…
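The seed-word idea behind inferring semantic orientation can be sketched as scoring a word by its statistical association with positive versus negative paradigm words. The tiny co-occurrence counts below are invented for illustration; the paper estimates association from large corpora (e.g., via pointwise mutual information).

```python
import math

# Hypothetical seed sets; the paper uses a larger fixed list of paradigm words.
POS_SEEDS = ["good", "excellent"]
NEG_SEEDS = ["bad", "poor"]

def so_score(word, cooc, totals, n):
    """Pointwise-mutual-information difference between a word's association
    with positive seeds and with negative seeds (add-one smoothed)."""
    def pmi(w1, w2):
        joint = cooc.get((w1, w2), 0) + 1
        return math.log2(joint * n / (totals[w1] * totals[w2]))
    return (sum(pmi(word, s) for s in POS_SEEDS)
            - sum(pmi(word, s) for s in NEG_SEEDS))

# Invented toy co-occurrence counts and marginals; n is the corpus size.
cooc = {("honest", "good"): 40, ("honest", "excellent"): 10,
        ("honest", "bad"): 2, ("honest", "poor"): 1,
        ("disturbing", "good"): 2, ("disturbing", "excellent"): 1,
        ("disturbing", "bad"): 30, ("disturbing", "poor"): 8}
totals = {"honest": 60, "disturbing": 45, "good": 100,
          "excellent": 40, "bad": 90, "poor": 50}
n = 10000
```

Under these toy counts, "honest" scores positive and "disturbing" scores negative, matching the direction of their semantic orientation; the magnitude of the score plays the role of degree.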

Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the study of pomdp's is motivated by a need to address realistic problems, existing techniques for finding optimal behavior do not appear to scale well and have been unable…

Activity recognition fits within the bigger framework of context awareness. In this paper, we report on our efforts to recognize user activity from accelerometer data. Activity recognition is formulated as a classification problem. Performance of base-level classifiers and meta-level classifiers is compared. Plurality Voting is found to perform consistently…
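The plurality-voting meta-classifier the abstract mentions simply returns the label predicted by the most base classifiers. The activity labels below are invented stand-ins for the base classifiers' outputs.

```python
from collections import Counter

def plurality_vote(predictions):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from five base-level classifiers for one window
# of accelerometer data.
votes = ["walking", "walking", "standing", "walking", "sitting"]
label = plurality_vote(votes)  # "walking"
```

Unlike a majority vote, a plurality vote needs no label to exceed half the votes; the most frequent prediction wins outright.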

- Justin A. Boyan, Michael L. Littman
- NIPS
- 1993

This paper describes the Q-routing algorithm for packet routing, in which a reinforcement learning module is embedded into each node of a switching network. Only local communication is used by each node to keep accurate statistics on which routing decisions lead to minimal delivery times. In simple experiments involving a 36-node, irregularly connected…
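The local learning rule at the heart of Q-routing can be sketched as follows: after forwarding a packet for destination d via neighbor y, node x nudges its delivery-time estimate toward the observed local delay plus y's own best remaining estimate. The update form is standard Q-routing; the three-node topology and delay figures are invented for illustration.

```python
def q_routing_update(Q, x, d, y, q_delay, trans_delay, neighbors, eta=0.5):
    """Q[x][(d, y)] is node x's estimated time to deliver a packet to
    destination d when forwarding it through neighbor y."""
    # y reports back its best remaining estimate (zero if y is the destination).
    t = 0.0 if y == d else min(Q[y][(d, z)] for z in neighbors[y])
    target = q_delay + trans_delay + t
    # Move the old estimate a fraction eta toward the observed target.
    Q[x][(d, y)] += eta * (target - Q[x][(d, y)])
    return Q[x][(d, y)]

# Toy line network 0 -- 1 -- 2; node 0 refines its estimate for destination 2
# after observing a 0.5-unit queueing delay and a 1.0-unit transmission delay.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
Q = {0: {(2, 1): 5.0}, 1: {(2, 0): 9.0, (2, 2): 1.0}}
new_est = q_routing_update(Q, x=0, d=2, y=1, q_delay=0.5,
                           trans_delay=1.0, neighbors=neighbors)
```

Only the neighbor's estimate and locally observed delays enter the update, which is what lets each node learn routing statistics from purely local communication.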

In this paper, we describe the partially observable Markov decision process (pomdp) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments, given a complete model of the environment. The pomdp approach was originally developed in the operations research community and provides a formal basis for planning…

We introduce a compact graph-theoretic representation for multi-party game theory. Our main result is a provably correct and efficient algorithm for computing approximate Nash equilibria in one-stage games represented by trees or sparse graphs.

Most exact algorithms for general partially observable Markov decision processes (POMDPs) use a form of dynamic programming in which a piecewise-linear and convex representation of one value function is transformed into another. We examine variations of the "incremental pruning" method for solving this problem and compare them to earlier algorithms…