We present a nonlinear version of the recursive least squares (RLS) algorithm that can be used to recursively construct minimum mean-squared-error solutions to nonlinear least-squares problems that are frequently encountered in signal processing applications.Expand

We show how given an algorithm for the PAC model Multi-armed Bandit problem, one can derive a batch learningalg orithm for Markov Decision Processes.Expand

We derive generalization bounds for learning algorithms based on their robustness: the property that if a testing sample is “similar” to a training sample, then the testing error is close to the training error.Expand

We consider the Multi-armed bandit problem under the PAC (“probably approximately correct”) model, and derive a matching lower bound that holds for any sampling policy.Expand

We consider a resource allocation problem where individual users wish to send data across a network to maximize their utility, and a cost is incurred at each link that depends on the total rate sent through the link.Expand

In this paper, we present a set of percentile criteria that are conceptually natural and representative of the trade-off between optimistic and pessimistic views of the question.Expand