Selective attention involves the differential processing of different stimuli, and has widespread psychological and neural consequences. Although computational modeling should offer a powerful way of…

C. R. Gallistel and J. Gibbon (2000) presented quantitative data on the speed with which animals acquire behavioral responses during autoshaping, together with a statistical model of learning…

We provide the first algorithm for online bandit linear optimization whose regret after T rounds is of order √(Td ln N) on any finite class X ⊆ ℝᵈ of N actions, and of order d√T (up to log factors)…

Recall that: L(w) = (1/n) E‖Xw − Y‖² = (1/n)‖Xw − E[Y]‖² + σ². Define our “empirical loss” as: L̂(w) = (1/n)‖Xw − Y‖², which has no expectation over Y. Note that for a fixed w, E[L̂(w)] = L(w), e.g. the…
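The identity E[L̂(w)] = L(w) can be checked numerically. The sketch below assumes a fixed design X with Y = Xw* + ε, ε ∼ N(0, σ²I); the particular n, d, w*, and σ are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))        # fixed design matrix
w_star = rng.normal(size=d)        # true parameter, so E[Y] = X @ w_star
sigma = 0.5
w = rng.normal(size=d)             # an arbitrary fixed w to evaluate

# Population loss: L(w) = (1/n)‖Xw − E[Y]‖² + σ²
L = np.sum((X @ w - X @ w_star) ** 2) / n + sigma ** 2

# Average the empirical loss L̂(w) = (1/n)‖Xw − Y‖² over many fresh draws of Y
trials = 20000
Y = X @ w_star + sigma * rng.normal(size=(trials, n))
est = np.mean(np.sum((X @ w - Y) ** 2, axis=1) / n)

print(abs(est - L) < 0.05)  # Monte Carlo average matches L(w)
```

The two terms of L(w) correspond exactly to the bias term ‖Xw − E[Y]‖²/n and the irreducible noise σ² in the decomposition above.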

We have recently been studying the case where we have a training set T generated from an underlying distribution, and our goal is to find some good hypothesis with respect to the true underlying…

Policy gradient methods have enjoyed great success in deep reinforcement learning but suffer from high variance of gradient estimates. The high-variance problem is particularly exacerbated in…

3 Motivation of Empirical Process
Consider a learning problem with observations Zᵢ = (Xᵢ, Yᵢ), a prediction rule f(Xᵢ), and a loss function L(f(Xᵢ), Yᵢ). Assume further that f is parameterized by θ ∈ Θ as…
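The setup above can be sketched concretely as an empirical risk, (1/n) Σᵢ L(f_θ(Xᵢ), Yᵢ). The text leaves f and L abstract; here a linear rule f_θ(x) = θ·x and squared loss are purely illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_risk(theta, X, Y):
    """(1/n) Σ_i L(f_θ(X_i), Y_i), with f_θ(x) = θ·x and squared loss
    (illustrative choices; the notes leave f and L abstract)."""
    preds = X @ theta
    return np.mean((preds - Y) ** 2)

# Toy observations Z_i = (X_i, Y_i)
n, d = 100, 3
X = rng.normal(size=(n, d))
theta_star = np.array([1.0, -2.0, 0.5])
Y = X @ theta_star + 0.1 * rng.normal(size=n)

# The empirical risk is small near the generating θ and large at a wrong θ
print(empirical_risk(theta_star, X, Y) < empirical_risk(np.zeros(d), X, Y))
```

Viewing the map θ ↦ empirical_risk(θ, X, Y) as a random function indexed by Θ is exactly what motivates studying the empirical process.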

But what about high dimensions? What is the density of the points near the mean? And how far away is the average point from its component mean? Let us address these questions for a single isotropic…
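A quick simulation makes the high-dimensional picture vivid: for an isotropic Gaussian, E‖x‖² = d, so the typical point sits at distance ≈ √d from the mean, with small relative spread. The dimension and sample count below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1000
samples = rng.normal(size=(10000, d))   # isotropic Gaussian, mean 0, identity covariance

dists = np.linalg.norm(samples, axis=1)
# Distances concentrate tightly around √d: almost no mass is near the mean itself
print(abs(dists.mean() - np.sqrt(d)) / np.sqrt(d) < 0.01)
print(dists.std() / dists.mean() < 0.05)  # relative spread is small
```

So although the density is highest at the mean, almost all points lie in a thin shell of radius about √d, which is the phenomenon the questions above are driving at.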

In an online linear optimization problem, on each period t, an online algorithm chooses sₜ ∈ S from a fixed (possibly infinite) set S of feasible decisions. Nature (who may be adversarial) chooses a…
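This interaction protocol can be sketched with online gradient descent as an illustrative algorithm (the excerpt does not specify one); here S is taken to be the Euclidean unit ball and nature's cost vectors are random, both hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(3)
d, T = 5, 2000

def project_ball(s):
    """Euclidean projection onto the unit ball (our illustrative choice of S)."""
    norm = np.linalg.norm(s)
    return s / norm if norm > 1 else s

s = np.zeros(d)
eta = 1.0 / np.sqrt(T)                   # standard OGD step size
costs = rng.uniform(-1, 1, size=(T, d))  # nature's cost vectors c_t

alg_loss = 0.0
for c in costs:
    alg_loss += c @ s                    # algorithm pays c_t · s_t
    s = project_ball(s - eta * c)        # gradient step, then project back into S

# Regret against the best fixed decision in hindsight, argmin_{s ∈ S} Σ_t c_t · s
best = -np.linalg.norm(costs.sum(axis=0))
regret = alg_loss - best
print(regret / T < 0.5)  # average regret vanishes, consistent with O(√T) regret
```

The key point of the protocol is that sₜ is committed before the period-t cost vector is revealed, which is why regret against the best fixed decision is the natural benchmark.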