
- Hamid Reza Maei
- 2011

We present a new family of gradient temporal-difference (TD) learning methods with function approximation whose complexity, both in terms of memory and per-time-step computation, scales linearly with the number of learning parameters. TD methods are powerful prediction techniques, and with function approximation form a core part of modern reinforcement…

- Cátia M Teixeira, Stephen R Pomedli, Hamid R Maei, Nohjin Kee, Paul W Frankland
- The Journal of Neuroscience: the official…
- 2006

Although the hippocampus plays a crucial role in the formation of spatial memories, as these memories mature they may become additionally (or even exclusively) dependent on extrahippocampal structures. However, the identity of these extrahippocampal structures that support remote spatial memory is currently not known. Using a Morris water-maze task, we show…

We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the number of features. Our algorithm, Greedy-GQ, is an extension of recent work on gradient temporal-difference learning, which has hitherto been restricted to a prediction (policy…

A new family of gradient temporal-difference learning algorithms has recently been introduced by Sutton, Maei and others in which function approximation is much more straightforward. In this paper, we introduce the GQ(λ) algorithm, which can be seen as an extension of that work to a more general setting including eligibility traces and off-policy learning of…

We introduce the first temporal-difference learning algorithms that converge with smooth value function approximators, such as neural networks. Conventional temporal-difference (TD) methods, such as TD(λ), Q-learning and Sarsa, have been used successfully with function approximation in many applications. However, it is well known that off-policy sampling, as…

- Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei
- NIPS
- 2008

We introduce the first temporal-difference learning algorithm that is stable with linear function approximation and off-policy training, for any finite Markov decision process, behavior policy, and target policy, and whose complexity scales linearly in the number of parameters. We consider an i.i.d. policy-evaluation setting in which the data need not come…

- Richard S. Sutton, Hamid Reza Maei, +4 authors Eric Wiewiora
- ICML
- 2009

Sutton, Szepesvári and Maei (2009) recently introduced the first temporal-difference learning algorithm compatible with both linear function approximation and off-policy training, and whose complexity scales only linearly in the size of the function approximator. Although their *gradient temporal difference* (GTD) algorithm converges reliably, it…
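The linear-complexity updates these abstracts refer to can be sketched roughly as follows. This is a GTD2-style variant assembled from the papers' general description; the function name, step sizes, and random features are illustrative assumptions, not code from the papers:

```python
import numpy as np

def gtd2_update(theta, w, phi, phi_next, reward, gamma, alpha, beta):
    """One GTD2-style update (sketch; names and step sizes are illustrative).

    theta: primary value-function weights
    w:     auxiliary weights tracking the expected TD error per feature
    Both updates touch each weight once, so memory and per-step
    computation are O(number of features), as the abstracts claim.
    """
    delta = reward + gamma * phi_next @ theta - phi @ theta  # TD error
    theta = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# Tiny illustration with random i.i.d. feature vectors
rng = np.random.default_rng(0)
theta = np.zeros(4)
w = np.zeros(4)
for _ in range(100):
    phi = rng.random(4)
    phi_next = rng.random(4)
    theta, w = gtd2_update(theta, w, phi, phi_next, reward=1.0,
                           gamma=0.9, alpha=0.01, beta=0.1)
```

The auxiliary weight vector `w` is what distinguishes this family from conventional TD(0): it corrects the gradient direction so the updates remain stable under off-policy sampling.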

We develop an off-policy actor–critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. This algorithm is developed with a view toward its use in mobile health. (arXiv:1607.05047v1 [stat.ML], 18 Jul 2016)

- Hamid R. Maei, Kirill Zaslavsky, Cátia M. Teixeira, Paul W. Frankland
- Front. Integr. Neurosci.
- 2009

The water maze is commonly used to assay spatial cognition, or, more generally, learning and memory in experimental rodent models. In the water maze, mice or rats are trained to navigate to a platform located below the water's surface. Spatial learning is then typically assessed in a probe test, where the platform is removed from the pool and the mouse or…
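The probe-test measure described above (time spent searching where the platform used to be) is often summarized as quadrant occupancy. A minimal sketch, assuming tracked (x, y) positions and a pool centered at the origin; the function name and quadrant labels are illustrative, not this paper's analysis method:

```python
import numpy as np

def quadrant_occupancy(xy, pool_center=(0.0, 0.0)):
    """Fraction of probe-trial samples spent in each pool quadrant.

    xy: (n, 2) array of tracked animal positions.
    Quadrants are labeled NE/NW/SE/SW relative to the pool center
    (illustrative convention).
    """
    dx = xy[:, 0] - pool_center[0]
    dy = xy[:, 1] - pool_center[1]
    labels = np.where(dy >= 0,
                      np.where(dx >= 0, "NE", "NW"),
                      np.where(dx >= 0, "SE", "SW"))
    names, counts = np.unique(labels, return_counts=True)
    return dict(zip(names, counts / len(xy)))

# A mouse that searches only the north-east quadrant:
track = np.array([[0.3, 0.4], [0.5, 0.2], [0.1, 0.6]])
occupancy = quadrant_occupancy(track)
```

Chance performance is 25% per quadrant, so occupancy well above 0.25 in the target quadrant is the usual evidence of spatial memory.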