#### Filter Results:

- Full text PDF available (43)

#### Publication Year

2001

2017

- This year (2)
- Last 5 years (4)
- Last 10 years (29)

#### Publication Type

#### Co-author

#### Journals and Conferences

#### Data Set Used

#### Key Phrases

Learn More

- Martin Zinkevich
- ICML
- 2003

Convex programming involves a convex set F ⊆ R n and a convex cost function c : F → R. The goal of convex programming is to find a point in F which minimizes c. In online convex programming, the convex set is known in advance, but in each step of some repeated optimization problem, one must select a point in F before seeing the cost function for that step.… (More)

Imitation learning of sequential, goal-directed behavior by standard supervised techniques is often difficult. We frame learning such behaviors as a maximum margin structured prediction problem over a space of policies. In this approach, we learn mappings from features to cost so an optimal policy in an MDP with these cost mimics the expert's behavior.… (More)

- Martin Zinkevich, Markus Weimer, Alexander J. Smola, Lihong Li
- NIPS
- 2010

With the increase in available data parallel machine learning has become an increasingly pressing problem. In this paper we present the first parallel stochastic gradient descent algorithm including a detailed analysis and experimental evidence. Unlike prior work on parallel optimization algorithms [5, 7] our variant comes with parallel acceleration… (More)

Promising approaches to structured learning problems have recently been developed in the maximum margin framework. Unfortunately, algorithms that are computationally and memory efficient enough to solve large scale problems have lagged behind. We propose using simple subgradient-based techniques for optimizing a regularized risk formulation of these… (More)

- Martin Zinkevich, Alexander J. Smola, John Langford
- NIPS
- 2009

Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well,… (More)

- Martin Zinkevich, Michael H. Bowling, Neil Burch
- AAAI
- 2007

In normal scenarios, computer scientists often consider the number of states in a game to capture the difficulty of learning an equilibrium. However, players do not see games in the same light: most consider Go or Chess to be more complex than Monopoly. In this paper, we discuss a new measure of game complexity that links existing state-of-the-art… (More)

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific… (More)

- Sandra Zilles, Steffen Lange, Robert C. Holte, Martin Zinkevich
- Journal of Machine Learning Research
- 2011

While most supervised machine learning models assume that training examples are sampled at random or adversarially, this article is concerned with models of learning from a cooperative teacher that selects " helpful " training examples. The number of training examples a learner needs for identifying a concept in a given class C of possible target concepts… (More)

Perfect recall is the common and natural assumption that an agent never forgets. As a consequence, the agent can always condition its choice of action on any prior observations. In this paper, we explore relaxing this assumption. We observe the negative impact this relaxation has on algorithms: some algorithms are no longer well-defined, while others lose… (More)