Online Learning under Delayed Feedback
- Pooria Joulani, A. György, Csaba Szepesvari
- Computer Science
- International Conference on Machine Learning
- 4 June 2013
Meta-algorithms are given that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle delays in the feedback loop; the well-known UCB algorithm is also modified for the bandit problem with delayed feedback.
Online Markov Decision Processes Under Bandit Feedback
- Gergely Neu, A. György, Csaba Szepesvari, A. Antos
- Computer Science
- IEEE Transactions on Automatic Control
- 6 December 2010
It is shown that after T time steps, the expected regret of this algorithm (more precisely, a slightly modified version thereof) is O(T^{1/2} ln T), giving the first rigorously proven, essentially tight regret bound for the problem.
Degenerate Feedback Loops in Recommender Systems
- Ray Jiang, S. Chiappa, Tor Lattimore, A. György, Pushmeet Kohli
- Computer Science
- AAAI/ACM Conference on AI, Ethics, and Society
- 27 January 2019
A novel theoretical analysis is provided that examines both the role of user dynamics and the behavior of recommender systems, disentangling the echo chamber from the filter bubble effect, and offers practical solutions to slow down system degeneracy.
Average age of information with hybrid ARQ under a resource constraint
- Elif Tuğçe Ceran, Deniz Gündüz, A. György
- Computer Science
- IEEE Wireless Communications and Networking…
- 1 October 2017
An average-cost reinforcement learning (RL) algorithm is proposed that learns the system parameters and the transmission policy in real time for an unknown environment, and the effectiveness of the proposed methods is verified through numerical simulations.
Individual convergence rates in empirical vector quantizer design
- A. Antos, L. Györfi, A. György
- Computer Science
- IEEE Transactions on Information Theory
- 1 November 2005
It is proved that for any fixed distribution supported on a given finite set the convergence rate is O(1/n) (faster than the minimax lower bound), where the corresponding constant depends on the source distribution.
Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection
- Andrea Paudice, Luis Muñoz-González, A. György, Emil C. Lupu
- Computer Science
- ArXiv
- 8 February 2018
This paper proposes a defence mechanism, based on outlier detection, to mitigate the effect of these optimal poisoning attacks, and shows empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered when crafting the attack.
The adversarial stochastic shortest path problem with unknown transition probabilities
- Gergely Neu, A. György, Csaba Szepesvari
- Computer Science, Mathematics
- International Conference on Artificial…
- 21 March 2012
This paper proposes an algorithm called “follow the perturbed optimistic policy” that learns and controls stochastic and adversarial components in an online fashion at the same time, and it is proved that the expected cumulative regret of the algorithm is of order L|X||A|√T up to logarithmic factors.
Near-optimal max-affine estimators for convex regression
- G. Balázs, A. György, Csaba Szepesvari
- Mathematics
- International Conference on Artificial…
- 21 February 2015
Least squares estimators that minimize the empirical risk over max-affine functions are proved to achieve the optimal rate of convergence, up to logarithmic factors, for regression problems over convex, uniformly bounded, uniformly Lipschitz function classes.
The On-Line Shortest Path Problem Under Partial Monitoring
- A. György, T. Linder, G. Lugosi, G. Ottucsák
- Computer Science
- Journal of Machine Learning Research
- 8 April 2007
The on-line shortest path problem is considered under various models of partial monitoring, and a version of the multi-armed bandit setting for shortest path is discussed, where the decision maker learns only the total weight of the chosen path but not the weights of the individual edges on the path.
Efficient Tracking of Large Classes of Experts
- A. György, T. Linder, G. Lugosi
- Computer Science
- IEEE Transactions on Information Theory
- 12 October 2011
This paper provides a method that can transform any prediction algorithm A that is designed for the base class into a tracking algorithm. The resulting tracking algorithm can take advantage of the prediction performance and potential computational efficiency of A, in the sense that it can be implemented with time and space complexity only three times larger than that of A.