
- Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
- IJCAI
- 2003

This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points, and planning for those only. By using stochastic trajectories to choose belief points, and by maintaining only one value hyperplane per point, it is able to…
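The point-based backup at the core of PBVI can be sketched in a few lines, assuming dense numpy arrays for the POMDP model (T for transitions, Z for observations, R for rewards); the shapes and the tiny two-state example below are illustrative, not from the paper:

```python
import numpy as np

def pbvi_backup(B, alphas, T, Z, R, gamma):
    """One point-based backup: returns one alpha-vector per belief point.

    B:      (k, S) belief points        T: (A, S, S) transition probs
    alphas: (n, S) current alpha set    Z: (A, S, O) observation probs
    R:      (A, S) immediate rewards    gamma: discount factor
    """
    # Project every alpha-vector through each (action, observation) pair:
    # g[a, o, i, s] = gamma * sum_{s'} T[a, s, s'] Z[a, s', o] alphas[i, s']
    g = gamma * np.einsum('asp,apo,ip->aois', T, Z, alphas)
    new_alphas = []
    for b in B:
        best_val, best_alpha = -np.inf, None
        for a in range(T.shape[0]):
            # for each observation, keep the projection that is best at b
            alpha_a = R[a] + sum(g[a, o][np.argmax(g[a, o] @ b)]
                                 for o in range(Z.shape[2]))
            if b @ alpha_a > best_val:
                best_val, best_alpha = b @ alpha_a, alpha_a
        new_alphas.append(best_alpha)
    return np.array(new_alphas)
```

Keeping a single maximizing vector per belief point, rather than all dominated ones, is what keeps the representation size bounded between backups.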

- Ajit Paul Singh, Geoffrey J. Gordon
- KDD
- 2008

Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations encode users' ratings of movies, movies' genres, and actors' roles in…
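The shared-factor idea can be sketched as an alternating least-squares factorization in which the movie factors V are shared between a ratings matrix and a genre matrix. This squared-loss version is an illustration only (the paper's formulation is more general), and all shapes below are made up:

```python
import numpy as np

def cmf_als(X, Y, k=2, iters=30, reg=1e-6, seed=0):
    """Jointly factor X ~ U V^T (users x movies) and Y ~ V W^T
    (movies x genres), sharing the movie factors V across relations."""
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(X.shape[0], k))
    V = rng.normal(size=(X.shape[1], k))
    W = rng.normal(size=(Y.shape[1], k))
    I = reg * np.eye(k)
    for _ in range(iters):
        U = X @ V @ np.linalg.inv(V.T @ V + I)
        W = Y.T @ V @ np.linalg.inv(V.T @ V + I)
        # V gets least-squares pressure from *both* relations --
        # that coupling is the "collective" part
        V = (X.T @ U + Y @ W) @ np.linalg.inv(U.T @ U + W.T @ W + I)
    loss = np.linalg.norm(X - U @ V.T)**2 + np.linalg.norm(Y - V @ W.T)**2
    return U, V, W, loss
```

Each update is the exact least-squares solution for one factor with the others held fixed, so the combined reconstruction loss is non-increasing across iterations.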

- Stéphane Ross, Geoffrey J. Gordon, J. Andrew Bagnell
- AISTATS
- 2011

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches (Daumé III et al., 2009; Ross and Bagnell, 2010) provide stronger guarantees…
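The aggregate-and-retrain loop behind this line of work can be illustrated on a toy chain world; the environment, the expert, and the table-lookup "learner" below are all invented for illustration:

```python
def expert(s):
    # hypothetical expert: always step toward the goal state 0
    return -1 if s > 0 else 0

def rollout(policy, start=5, horizon=10):
    """Run the policy in a chain world with states 0..9."""
    s, visited = start, []
    for _ in range(horizon):
        visited.append(s)
        s = max(0, min(9, s + policy(s)))
    return visited

def dagger(n_iters=5):
    data = {}                       # aggregated (state -> expert action) pairs
    def policy(s):                  # table-lookup learner; untrained states
        return data.get(s, +1)      # default to the *wrong* action (+1)
    for _ in range(n_iters):
        for s in rollout(policy):   # states the *learner* actually visits...
            data[s] = expert(s)     # ...get labeled by the expert, aggregated
    return policy
```

Because the expert is queried on the learner's own state distribution, the mistakes the learner makes early on (drifting away from the goal) are exactly the states that get labeled next, which is how the i.i.d. mismatch is addressed.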

- Geoffrey J. Gordon
- ICML
- 1995

The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experiments in this area have produced mixed results; there have been both notable successes and notable disappointments. Theory has been scarce, mostly due to the difficulty of reasoning…
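One class of approximators this paper shows combines safely with value iteration is "averagers" (non-expansive interpolators). A minimal sketch using state aggregation, with assumed dense-array shapes and a made-up toy MDP in mind:

```python
import numpy as np

def fitted_vi(P, R, gamma, groups, iters=300):
    """Value iteration interleaved with a state-aggregation 'averager'.

    P: (A, S, S) transitions, R: (A, S) rewards, groups[s]: cluster of s.
    Averaging within clusters is a non-expansion in the max norm, so the
    combined backup-then-average operator remains a gamma-contraction
    and the iteration converges.
    """
    V = np.zeros(R.shape[1])
    for _ in range(iters):
        Q = R + gamma * (P @ V)          # Bellman backup, shape (A, S)
        V = Q.max(axis=0)
        for c in set(groups):            # project onto piecewise-constant
            idx = [s for s, g in enumerate(groups) if g == c]
            V[idx] = V[idx].mean()
    return V
```

With a general (expansive) function approximator in place of the averaging step, the same loop can oscillate or diverge, which is the failure mode the paper's theory rules out.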

- Maxim Likhachev, Geoffrey J. Gordon, Sebastian Thrun
- NIPS
- 2003

In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It…
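The anytime idea can be sketched by running weighted A* with a decreasing inflation factor eps; each pass returns a solution whose cost is within a factor eps of optimal. Real ARA* additionally reuses search effort between passes; this restart-based version is a simplification, and the grid problem used in the usage note is invented:

```python
import heapq

def weighted_astar(start, goal, neighbors, h, eps):
    """A* with the heuristic inflated by eps >= 1: expands fewer nodes,
    and the returned cost is within a factor eps of optimal."""
    g = {start: 0}
    frontier = [(eps * h(start), start)]
    while frontier:
        _, s = heapq.heappop(frontier)
        if s == goal:
            return g[s]
        for t, cost in neighbors(s):
            if g[s] + cost < g.get(t, float('inf')):
                g[t] = g[s] + cost
                heapq.heappush(frontier, (g[t] + eps * h(t), t))
    return float('inf')

def anytime_search(start, goal, neighbors, h, eps_schedule=(3.0, 2.0, 1.0)):
    # Decreasing eps: a quick loose solution first, tighter bounds later.
    return [weighted_astar(start, goal, neighbors, h, e) for e in eps_schedule]
```

On a 4-connected grid with a Manhattan-distance heuristic, the schedule (3.0, 2.0, 1.0) yields a sequence of solution costs that never increases, with the final pass provably optimal.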

- Håkan L. S. Younes, +4 authors
- 2005

Asynchronous stochastic systems are abundant in the real world. Examples include queuing systems, telephone exchanges, and computer networks. Yet, little attention has been given to such systems in the model checking and planning literature, at least not without making limiting and often unrealistic assumptions regarding the dynamics of the systems. The…

- Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon
- Robotics: Science and Systems
- 2010

A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model which is too inaccurate to support planning or too large and complex…

- Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
- J. Artif. Intell. Res.
- 2006

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing…

- Peng Yang, Randy A. Freeman, Geoffrey J. Gordon, Kevin M. Lynch, Siddhartha S. Srinivasa, Rahul Sukthankar
- Automatica
- 2010

The ability of a robot team to reconfigure itself is useful in many applications: for metamorphic robots to change shape, for swarm motion towards a goal, for biological systems to avoid predators, or for mobile buoys to clean up oil spills. In many situations, auxiliary constraints, such as connectivity between team members or limits on the maximum…

MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start state to a goal, restricting computation to states relevant to this task can make much larger problems tractable. We introduce a new algorithm, Bounded RTDP, which can produce…
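The abstract's two ingredients, maintained value bounds and focus on states relevant to the start state, can be sketched on a toy deterministic chain (states 0..n, one action "move right" with cost 1, goal n). Real Bounded RTDP handles stochastic actions and samples successors in proportion to their bound gap, which is trivial here since each state has a single successor:

```python
import numpy as np

def brtdp_chain(n=6, tau=1e-6, max_trials=100):
    """Maintain lower (vl) and upper (vu) bounds on cost-to-go and run
    trials from the fixed start state 0 until the bounds meet there."""
    vl = np.zeros(n + 1)                    # admissible lower bound
    vu = np.full(n + 1, 2.0 * n)            # crude upper bound
    vl[n] = vu[n] = 0.0                     # the goal costs nothing
    for _ in range(max_trials):
        s = 0
        while s != n:
            vl[s] = max(vl[s], 1 + vl[s + 1])   # monotone backups of
            vu[s] = min(vu[s], 1 + vu[s + 1])   # both bounds
            if vu[s] - vl[s] < tau:         # bounds already tight here:
                break                       # deeper states are irrelevant
            s += 1                          # the only successor
        if vu[0] - vl[0] < tau:             # converged at the start state
            return vl, vu
    return vl, vu
```

The gap vu - vl gives a certificate of how far the current policy's value at the start state can be from optimal, which is what lets the algorithm stop early while ignoring irrelevant states.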