Reachability in two-clock timed automata is PSPACE-complete
Exponential Lower Bounds for Policy Iteration
- John Fearnley
- Computer Science, Mathematics
- ICALP
- 17 March 2010
This work extends lower bounds to Markov decision processes with the total reward and average-reward optimality criteria to show policy iteration style algorithms have exponential lower bounds in a two player game setting.
Market Making via Reinforcement Learning
A high-fidelity simulation of limit order book markets is developed, and a market making agent using temporal-difference reinforcement learning is designed using a linear combination of tile codings as a value function approximator and a custom reward function that controls inventory risk.
An ordered approach to solving parity games in quasi polynomial time and quasi linear space
A first implementation of a quasi-polynomial algorithm is provided, along with a number of side results, including minor algorithmic improvements, a quasi bi-linear complexity in the number of states and edges for a fixed number of colours, and matching lower bounds for the algorithm of Calude et al.
Non-oblivious Strategy Improvement
- John Fearnley
- Computer Science
- LPAR
- 15 March 2010
A structural property of these games is described, and it is shown that these structures can affect the behaviour of strategy improvement and can be used to accelerate strategy improvement algorithms.
Learning equilibria of games via payoff queries
This work studies a corresponding computational learning model and the query complexity of learning equilibria for various classes of games, and obtains the stronger result that an equilibrium can be identified while only learning a small fraction of the cost values.
Computing Approximate Nash Equilibria in Polymatrix Games
- Argyrios Deligkas, John Fearnley, Rahul Savani, P. Spirakis
- Economics, Computer Science
- Algorithmica
- 12 September 2014
Inspired by the algorithm of Tsaknakis and Spirakis, the algorithm uses a gradient-descent-style approach on the maximum regret of the players and can be applied to efficiently find a (0.5+δ)-Nash equilibrium in a two-player Bayesian game.
The Complexity of the Simplex Method
This paper uses the known connection between Markov decision processes (MDPs) and linear programming, and an equivalence between Dantzig's pivot rule and a natural variant of policy iteration for average-reward MDPs, to prove that it is PSPACE-complete to find the solution that is computed by the simplex method using Dantzig's pivot rule.
Distributed Methods for Computing Approximate Equilibria
- A. Czumaj, Argyrios Deligkas, M. Fasoulakis, John Fearnley, M. Jurdzinski, Rahul Savani
- Computer Science, Economics
- Algorithmica
- 10 December 2015
A new, distributed method to compute approximate Nash equilibria in bimatrix games that first solves two independent LPs, each of which is derived from one of the two payoff matrices, and then computes an approximate Nash equilibrium using only limited communication between the players.
Unique End of Potential Line
UEOPL is shown to be a promise-subclass of EOPL in which the line in the End-of-Potential-Line instance is guaranteed to be unique via a promise. OPDC is shown to lie in UEOPL, and the results imply that parity games, mean-payoff games, discounted games, and simple-stochastic games lie in UEOPL.