We consider a linear regression game in which the covariates are known in advance: at each round, the learner predicts a real value, the adversary reveals a label, and the learner incurs a squared…

We consider an adversarial formulation of the problem of predicting a time series with square loss. The aim is to predict an arbitrary sequence of vectors almost as well as the best smooth comparator…

- Yasin Abbasi-Yadkori, Peter L. Bartlett, Xi Chen, Alan Malek
- ICML
- 2015

We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space,…

- Wouter M. Koolen, Alan Malek, Peter L. Bartlett
- NIPS
- 2014

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We…

- Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek
- AISTATS
- 2017

We propose the Hit-and-Run algorithm for planning and sampling problems in non-convex spaces. For sampling, we show the first analysis of the Hit-and-Run algorithm in non-convex spaces and show that…

- Wojciech Kotlowski, Wouter M. Koolen, Alan Malek
- COLT
- 2016

We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially…

- Jason Altschuler, Victor-Emmanuel Brunel, Alan Malek
- ArXiv
- 2018

This paper studies active learning in the context of robust statistics. Specifically, we propose a variant of the Best Arm Identification problem for contaminated bandits, where each arm pull has…

- Wojciech Kotlowski, Wouter M. Koolen, Alan Malek
- NIPS
- 2017

We revisit isotonic regression on linear orders, the problem of fitting monotonic functions to best explain the data, in an online setting. It was previously shown that online isotonic regression is…

We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A simple random uniform learner obtains the optimal rate of error in the adversarial scenario. However,…

- Yasin Abbasi-Yadkori, Peter L. Bartlett, Alan Malek
- ICML
- 2014

We consider the problem of controlling a Markov decision process (MDP) with a large state space, so as to minimize average cost. Since it is intractable to compete with the optimal policy for large…