- Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, Andrew Cotter
- Math. Program.
- 2007

We describe and analyze a simple and effective iterative algorithm for solving the optimization problem cast by Support Vector Machines (SVM). Our method alternates between stochastic gradient descent steps and projection steps. We prove that the number of iterations required to obtain a solution of accuracy ε is Õ(1/ε). In contrast, previous…
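The alternation between stochastic gradient steps and projection steps described in this abstract can be sketched roughly as below. This is a minimal illustration, not the authors' code: the function name `pegasos`, the parameters `lam`, `n_iters`, and `seed`, the step size 1/(λt), the one-example-per-step sampling, and the projection radius 1/√λ are all assumptions based on the commonly described form of this kind of solver for the regularized hinge-loss objective.

```python
import numpy as np

def pegasos(X, y, lam=0.1, n_iters=1000, seed=0):
    """Sketch of a stochastic subgradient SVM solver with a projection step.

    X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    lam, n_iters, seed are hypothetical parameter names for this sketch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    radius = 1.0 / np.sqrt(lam)  # assumed radius of the projection ball
    for t in range(1, n_iters + 1):
        i = rng.integers(n)               # pick one training example at random
        eta = 1.0 / (lam * t)             # assumed step-size schedule
        margin = y[i] * (X[i] @ w)        # margin under the current weights
        w = (1.0 - eta * lam) * w         # subgradient of the regularizer
        if margin < 1.0:                  # hinge loss is active for this example
            w = w + eta * y[i] * X[i]
        norm = np.linalg.norm(w)
        if norm > radius:                 # projection step onto the norm ball
            w = w * (radius / norm)
    return w
```

Calling `pegasos(X, y)` on a small labeled data set returns a weight vector `w`; predictions are then `np.sign(X @ w)`. The sketch omits a bias term and mini-batching for brevity.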

- Nathan Srebro, Jason D. M. Rennie, Tommi S. Jaakkola
- NIPS
- 2004

We present a novel approach to collaborative prediction, using low-norm instead of low-rank factorizations. The approach is inspired by, and has strong connections to, large-margin linear discrimination. We show how to learn low-norm factorizations by solving a semi-definite program, and discuss generalization error bounds for them.

- Jason D. M. Rennie, Nathan Srebro
- ICML
- 2005

Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite program (SDP) and learned using standard SDP solvers. However, current SDP solvers can only handle MMMF problems on matrices…

- Nathan Srebro, Tommi S. Jaakkola
- ICML
- 2003

We study the common problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient (EM) algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise in…
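One common way to realize an EM-style iteration for weighted low-rank approximation is sketched below, under stated assumptions: the weights lie in [0, 1], each step fills the target with the current estimate in proportion to the missing weight, and the unweighted problem is then solved by a truncated SVD. The function name `weighted_low_rank` and its parameters are hypothetical, and the abstract does not confirm that this is the exact procedure the paper analyzes.

```python
import numpy as np

def weighted_low_rank(A, W, k, n_iters=100):
    """Sketch of an EM-style iteration for weighted rank-k approximation.

    A: (n, m) target matrix; W: (n, m) weights assumed to lie in [0, 1].
    Aims to reduce sum(W * (A - X)**2) over matrices X of rank at most k.
    """
    X = np.zeros_like(A, dtype=float)  # initial estimate (an arbitrary choice)
    for _ in range(n_iters):
        # Fill step: blend the target with the current estimate by weight.
        filled = W * A + (1.0 - W) * X
        # Fit step: best unweighted rank-k approximation via truncated SVD.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        X = (U[:, :k] * s[:k]) @ Vt[:k]
    return X
```

With all weights equal to 1 the loop reduces to a single truncated SVD, which is the closed-form solution of the unweighted problem. With general weights the iteration may stop at a local optimum that depends on the starting point.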

- Yonatan Amit, Michael Fink, Nathan Srebro, Shimon Ullman
- ICML
- 2007

This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics. We cast this as a convex optimization problem, using trace-norm regularization and study gradient-based optimization both for the linear case…

- Andreas Argyriou, Rina Foygel, Nathan Srebro
- NIPS
- 2012

We derive a novel norm that corresponds to the tightest convex relaxation of sparsity combined with an ℓ2 penalty. We show that this new k-support norm provides a tighter relaxation than the elastic net and can thus be advantageous in sparse prediction problems. We also bound the looseness of the elastic net, thus shedding new light on it and providing…

- Maria-Florina Balcan, Avrim Blum, Nathan Srebro
- COLT
- 2008

We continue the investigation of natural conditions for a similarity function to allow learning, without requiring the similarity function to be a valid kernel, or referring to an implicit high-dimensional space. We provide a new notion of a “good similarity function” that builds upon the previous definition of Balcan and Blum (2006) but improves on it in…

- Shai Shalev-Shwartz, Nathan Srebro
- ICML
- 2008

We discuss how the runtime of SVM optimization should *decrease* as the size of the training data increases. We present theoretical and empirical results demonstrating how a simple subgradient descent approach indeed displays such behavior, at least for linear kernels.

- David R. Karger, Nathan Srebro
- SODA
- 2001

Markov networks are a common class of graphical models used in machine learning. Such models use an undirected graph to capture dependency information among random variables in a joint probability distribution. Once one has chosen to use a Markov network model, one aims to choose the model that “best explains” the data that has been…

- Shai Shalev-Shwartz, Nathan Srebro, Tong Zhang
- SIAM Journal on Optimization
- 2010

We study the problem of minimizing the expected loss of a linear predictor while constraining its sparsity, i.e., bounding the number of features used by the predictor. While the resulting optimization problem is generally NP-hard, several approximation algorithms are considered. We analyze the performance of these algorithms, focusing on the…