We describe and analyze a simple and effective iterative algorithm for solving the optimization problem cast by Support Vector Machines (SVM). Our method alternates between stochastic gradient descent steps and projection steps. We prove that the number of iterations required to obtain a solution of accuracy ε is Õ(1/ε). In contrast, previous…
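A minimal sketch of the alternating scheme the abstract describes, in the spirit of Pegasos: each iteration takes a stochastic subgradient step on the regularized hinge loss and then projects onto a ball of radius 1/√λ. The function name, step-size schedule, and hyperparameter defaults are illustrative, not the paper's exact pseudocode.

```python
import numpy as np

def pegasos_sgd(X, y, lam=0.1, n_iters=1000, seed=0):
    """Minimize  lam/2 ||w||^2 + (1/m) sum_i max(0, 1 - y_i <w, x_i>)
    by alternating stochastic subgradient steps with projections onto
    the ball {w : ||w|| <= 1/sqrt(lam)}."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    radius = 1.0 / np.sqrt(lam)
    for t in range(1, n_iters + 1):
        i = rng.integers(m)
        eta = 1.0 / (lam * t)          # decaying step size
        margin = y[i] * (X[i] @ w)
        w *= 1.0 - eta * lam           # gradient of the regularizer
        if margin < 1:                 # hinge-loss subgradient is active
            w += eta * y[i] * X[i]
        norm = np.linalg.norm(w)       # projection step
        if norm > radius:
            w *= radius / norm
    return w
```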
We present a novel approach to collaborative prediction, using low-norm instead of low-rank factorizations. The approach is inspired by, and has strong connections to, large-margin linear discrimination. We show how to learn low-norm factorizations by solving a semi-definite program, and discuss generalization error bounds for them.
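As a concrete illustration, a low-norm fit can be written directly as a semi-definite program using the standard SDP characterization of the trace norm. The toy data, the cvxpy modeling language, and the trade-off constant C below are assumptions for illustration, not the paper's experimental setup.

```python
import cvxpy as cp
import numpy as np

# Toy partially observed +/-1 matrix; 0 marks an unobserved entry.
Y = np.array([[ 1, -1,  0],
              [ 0,  1, -1],
              [ 1,  0,  1]], dtype=float)
W = (Y != 0).astype(float)  # observation mask
C = 1.0                     # illustrative loss/norm trade-off

n, m = Y.shape
X = cp.Variable((n, m))
A = cp.Variable((n, n), symmetric=True)
B = cp.Variable((m, m), symmetric=True)

# ||X||_tr <= (tr A + tr B)/2  whenever  [[A, X], [X.T, B]] is PSD,
# so the PSD constraint plus the trace term bounds the trace norm.
psd = cp.bmat([[A, X], [X.T, B]]) >> 0
# large-margin (hinge) fit on the observed entries only
hinge = cp.sum(cp.multiply(W, cp.pos(1 - cp.multiply(Y, X))))
prob = cp.Problem(
    cp.Minimize(0.5 * (cp.trace(A) + cp.trace(B)) + C * hinge), [psd])
prob.solve()
```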
Maximum Margin Matrix Factorization (MMMF) was recently suggested (Srebro et al., 2005) as a convex, infinite dimensional alternative to low-rank approximations and standard factor models. MMMF can be formulated as a semi-definite program (SDP) and learned using standard SDP solvers. However, current SDP solvers can only handle MMMF problems on matrices…
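To scale beyond what generic SDP solvers can handle, the natural alternative is to optimize a bounded-dimension factorization X = UVᵀ directly by gradient descent, trading convexity for speed. The sketch below uses a plain hinge subgradient; the function name and hyperparameters are illustrative.

```python
import numpy as np

def fast_mmmf(Y, mask, rank=10, lam=0.1, lr=0.05, n_iters=500, seed=0):
    """Fit X = U @ V.T to the observed +/-1 entries of Y with a hinge
    loss, regularized by (lam/2)(||U||_F^2 + ||V||_F^2), which
    upper-bounds lam * ||X||_tr for any factorization."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    for _ in range(n_iters):
        X = U @ V.T
        active = mask & (Y * X < 1)       # entries with positive hinge loss
        G = np.where(active, -Y, 0.0)     # subgradient w.r.t. X
        gU = G @ V + lam * U              # chain rule through X = U V^T
        gV = G.T @ U + lam * V
        U -= lr * gU
        V -= lr * gV
    return U, V
```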
We study the common problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient EM algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted counterparts, do not admit a closed-form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise…
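The EM iteration here is short enough to state in full: the E-step fills in the target with the current low-rank estimate (weighting each entry by W), and the M-step is an ordinary unweighted rank-k SVD of the filled-in matrix. A sketch, assuming weights in [0, 1]:

```python
import numpy as np

def weighted_lra_em(A, W, rank=2, n_iters=100):
    """Minimize  sum_ij W_ij (A_ij - X_ij)^2  over rank-<=k matrices X,
    with weights W in [0, 1], by alternating an imputation (E) step
    with an unweighted truncated-SVD (M) step."""
    X = np.zeros_like(A, dtype=float)
    for _ in range(n_iters):
        filled = W * A + (1.0 - W) * X                 # E-step
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]       # M-step: best rank-k fit
    return X
```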
We continue the investigation of natural conditions for a similarity function to allow learning, without requiring the similarity function to be a valid kernel, or referring to an implicit high-dimensional space. We provide a new notion of a "good similarity function" that builds upon the previous definition of Balcan and Blum (2006) but improves on it…
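The operational content of such a definition is that similarities to a sample of landmark examples can be used directly as features, after which an ordinary large-margin linear learner is run. A minimal sketch of that mapping; the function names are illustrative.

```python
import numpy as np

def similarity_features(X, landmarks, sim):
    """Map each example to its vector of similarities with a fixed set
    of landmark examples. For a "good" similarity function, the classes
    become (approximately) linearly separable with margin in this
    feature space, so any linear learner can then be applied."""
    return np.array([[sim(x, lm) for lm in landmarks] for x in X])

# e.g.: feats = similarity_features(X_train, X_train[:50], sim)
# followed by any large-margin linear classifier on `feats`.
```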
For supervised classification problems, it is well known that learnability is equivalent to uniform convergence of the empirical risks and thus to learnability by empirical minimization. Inspired by recent regret bounds for online convex optimization, we study stochastic convex optimization, and uncover a surprisingly different situation in the more…
We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination…
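For binary labels, predictions, and groups, the adjustment reduces to a small linear program: for each group a and each base prediction s, choose p[a, s] = P(output 1 | group a, prediction s) so that true- and false-positive rates match across groups at minimal expected loss. A sketch using scipy's LP solver; it assumes both labels occur in both groups, and the helper names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def equalize_rates(y, y_hat, group):
    """Derive flipping probabilities p[a, s] so the randomized
    predictor has equal TPR and FPR across groups 0 and 1, while
    minimizing expected 0/1 loss. Variable order: [p00, p01, p10, p11]."""
    y, y_hat, group = map(np.asarray, (y, y_hat, group))
    # r[(a, c)] = P(y_hat = 1 | y = c, group = a); assumes non-empty cells
    r = {(a, c): y_hat[(group == a) & (y == c)].mean()
         for a in (0, 1) for c in (0, 1)}
    # w[(a, c)] = P(y = c, group = a)
    w = {(a, c): np.mean((group == a) & (y == c))
         for a in (0, 1) for c in (0, 1)}

    def rate_coeffs(a, c):
        # TPR_a (c=1) or FPR_a (c=0) as a linear function of p
        out = np.zeros(4)
        out[2 * a] = 1 - r[(a, c)]      # coefficient of p[a, 0]
        out[2 * a + 1] = r[(a, c)]      # coefficient of p[a, 1]
        return out

    # expected error = sum_a  w[a,0] * FPR_a + w[a,1] * (1 - TPR_a)
    cost = sum(w[(a, 0)] * rate_coeffs(a, 0) - w[(a, 1)] * rate_coeffs(a, 1)
               for a in (0, 1))
    A_eq = np.stack([rate_coeffs(0, 1) - rate_coeffs(1, 1),   # equal TPR
                     rate_coeffs(0, 0) - rate_coeffs(1, 0)])  # equal FPR
    res = linprog(cost, A_eq=A_eq, b_eq=[0.0, 0.0], bounds=[(0, 1)] * 4)
    return res.x.reshape(2, 2)  # p[a, s]
```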
This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics. We cast this as a convex optimization problem, using trace-norm regularization, and study gradient-based optimization both for the linear case…
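One concrete way to carry out the gradient-based optimization for the linear case is proximal gradient descent, where each step soft-thresholds the singular values of the class-weight matrix (the proximal operator of the trace norm); the low rank this induces is what ties the class predictors to shared characteristics. The loss choice and hyperparameters below are illustrative.

```python
import numpy as np

def svt(W, tau):
    """Proximal operator of the trace norm: soft-threshold singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def multiclass_tracenorm(X, y, n_classes, lam=0.1, lr=0.1, n_iters=200):
    """Proximal gradient descent on a multinomial-logistic loss for a
    linear multiclass model W (d x k) with trace-norm regularization."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    Y = np.eye(n_classes)[y]                   # one-hot labels
    for _ in range(n_iters):
        Z = X @ W
        Z -= Z.max(axis=1, keepdims=True)      # numerical stability
        P = np.exp(Z)
        P /= P.sum(axis=1, keepdims=True)
        grad = X.T @ (P - Y) / n               # gradient of the smooth loss
        W = svt(W - lr * grad, lr * lam)       # prox step on the trace norm
    return W
```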
We discuss how the runtime of SVM optimization should decrease as the size of the training data increases. We present theoretical and empirical results demonstrating how a simple subgradient descent approach indeed displays such behavior, at least for linear kernels.
We study the rank, trace-norm and max-norm as complexity measures of matrices, focusing on the problem of fitting a matrix with matrices having low complexity. We present generalization error bounds for predicting unobserved entries that are based on these measures. We also consider the possible relations between these measures. We show gaps between them,…
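For concreteness: on a fixed matrix, rank and trace-norm come straight from the singular values, while the max-norm can be evaluated through its standard SDP characterization. The cvxpy usage below is an assumption for illustration.

```python
import numpy as np
import cvxpy as cp

def rank_and_tracenorm(X, tol=1e-9):
    """Rank (number of nonzero singular values) and trace norm (their sum)."""
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s > tol)), float(s.sum())

def max_norm(X):
    """||X||_max = min over factorizations X = U V.T of
    (max row norm of U) * (max row norm of V); equivalently the
    smallest t admitting a PSD matrix [[A, X], [X.T, B]] with all
    diagonal entries of A and B at most t."""
    n, m = X.shape
    t = cp.Variable()
    A = cp.Variable((n, n), symmetric=True)
    B = cp.Variable((m, m), symmetric=True)
    cons = [cp.bmat([[A, X], [X.T, B]]) >> 0,
            cp.diag(A) <= t, cp.diag(B) <= t]
    cp.Problem(cp.Minimize(t), cons).solve()
    return float(t.value)
```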