- Martin Zinkevich
- ICML
- 2003

Convex programming involves a convex set F ⊆ R^n and a convex cost function c : F → R. The goal of convex programming is to find a point in F which minimizes c. In online convex programming, the convex set is known in advance, but in each step of some repeated optimization problem, one must select a point in F before seeing the cost function for that step.… (More)
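The online setting described in this abstract can be illustrated with projected online gradient descent, a standard approach to it. The sketch below is not taken from the truncated abstract: the feasible set (a Euclidean ball), the 1/sqrt(t) step size, the quadratic costs, and all function names are assumptions chosen for illustration.

```python
import math

def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the ball {x : ||x|| <= radius} (assumed feasible set F)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [v * scale for v in x]

def online_gradient_descent(gradient_fns, dim, radius=1.0):
    """Play one point per round; the cost's gradient is revealed only after playing."""
    x = [0.0] * dim
    played = []
    for t, grad in enumerate(gradient_fns, start=1):
        played.append(list(x))          # commit to a point before seeing the cost
        g = grad(x)                     # cost for this round is now revealed
        eta = 1.0 / math.sqrt(t)        # assumed decaying step size
        x = project_to_ball([xi - eta * gi for xi, gi in zip(x, g)], radius)
    return played

def make_quadratic_gradient(target):
    """Gradient of the hypothetical cost c(x) = ||x - target||^2."""
    return lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]

# Hypothetical cost sequence: the same quadratic cost repeated for 50 rounds.
targets = [[0.5, -0.2]] * 50
played = online_gradient_descent(
    [make_quadratic_gradient(t) for t in targets], dim=2
)
```

Every played point stays inside the feasible set because of the projection step, and with a fixed cost the iterates settle on its minimizer.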

- Martin Zinkevich, Markus Weimer, Alexander J. Smola, Lihong Li
- NIPS
- 2010

With the increase in available data, parallel machine learning has become an increasingly pressing problem. In this paper we present the first parallel stochastic gradient descent algorithm including a detailed analysis and experimental evidence. Unlike prior work on parallel optimization algorithms [5, 7] our variant comes with parallel… (More)
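One common way to parallelize stochastic gradient descent is to run independent SGD on separate data shards and average the resulting parameters. The sketch below simulates that scheme sequentially on a one-dimensional least-squares problem. The excerpt does not confirm that this matches the paper's algorithm; the model, learning rate, shard layout, and function names are all assumptions for illustration.

```python
import random

def sgd_on_shard(shard, lr, epochs, seed):
    """Plain SGD for the 1-D model y ≈ w * x with squared loss, on one worker's shard."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        order = list(shard)
        rng.shuffle(order)                  # visit the shard in a fresh random order
        for x, y in order:
            grad = 2.0 * (w * x - y) * x    # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

def parallel_sgd_by_averaging(data, num_workers, lr=0.05, epochs=20):
    """Simulate workers sequentially: each trains on its shard, then weights are averaged."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    weights = [sgd_on_shard(s, lr, epochs, seed=i) for i, s in enumerate(shards)]
    return sum(weights) / len(weights)

# Synthetic data (hypothetical): y = 3x plus small Gaussian noise.
rng = random.Random(0)
data = [(x, 3.0 * x + rng.gauss(0.0, 0.01))
        for x in (rng.uniform(-1.0, 1.0) for _ in range(200))]
w_avg = parallel_sgd_by_averaging(data, num_workers=4)
```

In a real deployment each call to `sgd_on_shard` would run on its own machine or process, with communication needed only for the final average.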

Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to cost so an optimal policy in an MDP with these costs mimics the expert's behavior.… (More)

Promising approaches to structured learning problems have recently been developed in the maximum margin framework. Unfortunately, algorithms that are computationally and memory efficient enough to solve large scale problems have lagged behind. We propose using simple subgradient-based techniques for optimizing a regularized risk formulation of these… (More)

- Martin Zinkevich, Alexander J. Smola, John Langford
- NIPS
- 2009

Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well,… (More)

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific… (More)

Adaptation to other initially unknown agents often requires computing an effective counter-strategy. In the Bayesian paradigm, one must find a good counter-strategy to the inferred posterior of the other agents’ behavior. In the experts paradigm, one may want to choose experts that are good counter-strategies to the other agents’ expected behavior. In this… (More)

- Avrim Blum, Tuomas Sandholm, Martin Zinkevich
- SODA
- 2002

In this paper we study the problem of online market clearing where there is one commodity in the market, being bought and sold by multiple buyers and sellers who submit buy and sell bids that arrive and expire at different times. The auctioneer is faced with an online clearing problem of deciding which buy and sell bids to match without knowing what bids… (More)

Maximum margin structured learning (MMSL) has recently gained recognition within the machine learning community as a tractable method for large scale learning. However, most current methods are limited in terms of scalability, convergence, or memory requirements. The original Structured SMO method proposed in (Taskar et al., 2003) is slow to converge,… (More)

- Martin Zinkevich, Tucker R. Balch
- ICML
- 2001

This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in single agent systems as well as multiagent systems and multirobot systems. We prove that if an MDP possesses a symmetry, then the optimal value function and Q function are similarly… (More)