Corpus ID: 16170082

Boosting Classifiers with Tightened L0-Relaxation Penalties

@inproceedings{Goldberg2010BoostingCW,
  title={Boosting Classifiers with Tightened L0-Relaxation Penalties},
  author={Noam Goldberg and Jonathan Eckstein},
  booktitle={ICML},
  year={2010}
}
We propose a novel boosting algorithm which improves on current algorithms for weighted voting classification by striking a better balance between classification accuracy and the sparsity of the weight vector. In order to justify our optimization formulations, we first consider a novel integer linear program as a model for sparse classifier selection, generalizing the minimum disagreement half-space problem whose complexity has been investigated in computational learning theory. Specifically…
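
For concreteness, an integer linear program of this kind can couple misclassification indicators with support indicators on the weights. The sketch below is illustrative only: the big-M constant M, the penalty parameter C, and the base hypotheses h_j are placeholders, not necessarily the paper's exact model.

\[
\begin{aligned}
\min_{w,\,z,\,v}\quad & \sum_{i=1}^{m} z_i + C \sum_{j=1}^{n} v_j \\
\text{s.t.}\quad & y_i \sum_{j=1}^{n} w_j h_j(x_i) \ge 1 - M z_i, \qquad i = 1,\dots,m, \\
& 0 \le w_j \le v_j, \qquad j = 1,\dots,n, \\
& z_i \in \{0,1\}, \quad v_j \in \{0,1\}.
\end{aligned}
\]

Here z_i = 1 marks an example allowed to violate its margin constraint, and \sum_j v_j is exactly the L0 penalty on the weight vector w; relaxing v from \{0,1\}^n to [0,1]^n yields the kind of LP relaxation whose tightness is at issue.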
Citations

Sparse weighted voting classifier selection and its linear programming relaxations
This work shows that common "soft margin" linear programming (LP) formulations for constructing weighted voting classifiers are equivalent to an LP relaxation of the authors' formulation, and proves that its optimum is at least as hard to approximate as the minimum disagreement halfspace problem for a wide range of penalty parameter values.
A New Perspective on Convex Relaxations of Sparse SVM
A convex relaxation of a sparse support vector machine (SVM), based on the perspective relaxation of mixed-integer nonlinear programs, is proposed that can be efficiently solved by standard conic optimization solvers.
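
For orientation, the perspective relaxation mentioned here typically replaces a quadratic term w_j^2 guarded by a binary indicator v_j with its convex perspective w_j^2 / v_j (with the convention 0/0 = 0). A generic sparse-SVM sketch along these lines, not necessarily this paper's exact formulation, is

\[
\min_{w,\,v}\; \sum_{i=1}^{m} \max\{0,\; 1 - y_i w^\top x_i\} + \lambda \sum_{j=1}^{d} \frac{w_j^2}{v_j}
\qquad \text{s.t.} \quad \sum_{j=1}^{d} v_j \le k, \quad v \in [0,1]^d,
\]

where each epigraph constraint w_j^2 \le s_j v_j is second-order-cone representable, which is what allows standard conic solvers to handle the relaxation.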
An Improved Branch-and-Bound Method for Maximum Monomial Agreement
An exact branch-and-bound method for maximum agreement over Boolean monomials is described, improving on the earlier work of Goldberg and Shan by developing a tighter upper-bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set while having a lower branching factor.
IPBoost - Non-Convex Boosting via Integer Programming
This paper explores non-convex boosting in classification by means of integer programming and demonstrates the real-world practicability of the approach while circumventing shortcomings of convex boosting approaches.
Multilinear Sets with Cardinality Constraints
The problem of minimizing a multilinear function of binary variables is a well-studied NP-hard problem. The set of solutions of the standard linearization of this problem is called the multilinear set. …
Cardinality Constrained Multilinear Sets
A cardinality-constrained version of this problem, with upper and lower bounds on the number of nonzero variables, is studied, and a complete polyhedral description of its convex hull is given for the case in which the multilinear terms in the problem have a nested structure.
Supersparse linear integer models for optimized medical scoring systems
This paper provides bounds on the testing and training accuracy of SLIM scoring systems, and presents a new data reduction technique that can improve scalability by eliminating a portion of the training data beforehand.
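
For context, the SLIM model of Ustun and Rudin trades the 0-1 training loss against L0 and L1 penalties over small-integer coefficients; schematically,

\[
\min_{w \in \mathcal{L}}\; \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\big[ y_i\, w^\top x_i \le 0 \big] + C_0 \|w\|_0 + \epsilon \|w\|_1,
\]

where \mathcal{L} is a small set of integer coefficient vectors (for example, entries in \{-10,\dots,10\}) so that the fitted model reads as a scoring system, and the tiny \epsilon \|w\|_1 term merely breaks ties among equally accurate, equally sparse solutions.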
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs
An EM-based algorithm is presented for optimizing a probabilistic model called RNNLogic, which treats logic rules as a latent variable and simultaneously trains a rule generator as well as a reasoning predictor that uses logic rules.
Multi-class classification using a signomial function
A column generation method that iteratively generates good signomial terms is proposed; the resulting classifiers obtain better or comparable classification accuracies relative to existing methods while also being more sparse.
Combinatorial Optimization: 6th International Symposium, ISCO 2020, Montreal, QC, Canada, May 4–6, 2020, Revised Selected Papers
This work provides the corresponding linear relaxations, discusses their combinatorial structure, and demonstrates how the associated polyhedra can be entirely described, or how polyhedral arguments can be applied to find minimum such sets for special graphs.

References

Showing 1-10 of 28 references.
Sparse Weighted Voting Classifier Selection and its LP Relaxations
We consider a combinatorial optimization problem that generalizes the minimum disagreement halfspace problem; we seek to minimize the number of misclassifications of a weighted voting classifier, …
Linear Programming Boosting via Column Generation
It is proved that, for classification, minimizing the 1-norm soft margin error function directly optimizes a generalization error bound, and the resulting method is competitive in quality and computational cost with AdaBoost.
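
To make the column-generation mechanics concrete, here is a minimal LPBoost-style sketch in Python. It is an illustrative reconstruction, not the authors' code: decision stumps stand in for the weak-hypothesis class, SciPy's HiGHS backend is assumed for the LP duals, and the marginal sign conventions should be checked against the installed SciPy version.

import numpy as np
from scipy.optimize import linprog

def stump(X, f, t, s):
    # Decision stump h(x) in {-1, +1}: predict s when x[f] > t, else -s.
    return s * np.where(X[:, f] > t, 1.0, -1.0)

def best_stump(X, y, u):
    # Weak learner: maximize the weighted edge  sum_i u_i y_i h(x_i).
    best_edge, best_params = -np.inf, None
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for s in (1.0, -1.0):
                edge = np.dot(u * y, stump(X, f, t, s))
                if edge > best_edge:
                    best_edge, best_params = edge, (f, t, s)
    return best_edge, best_params

def lpboost(X, y, D=0.2, max_iters=50, tol=1e-6):
    m = len(y)
    u = np.full(m, 1.0 / m)       # dual distribution over training examples
    beta = 0.0                    # dual bound on the achievable edge
    hypotheses, columns, a = [], [], np.array([])
    for _ in range(max_iters):
        edge, params = best_stump(X, y, u)
        if columns and edge <= beta + tol:
            break                 # no new column prices out: master is optimal
        hypotheses.append(params)
        columns.append(y * stump(X, *params))
        J = len(columns)
        H = np.column_stack(columns)          # H[i, j] = y_i * h_j(x_i)
        # Restricted master over x = [a (J vars), xi (m vars), rho]:
        #   maximize rho - D*sum(xi)   <=>   minimize -rho + D*sum(xi)
        #   s.t. rho - xi_i - (H a)_i <= 0,  sum_j a_j = 1,  a, xi >= 0.
        c = np.concatenate([np.zeros(J), D * np.ones(m), [-1.0]])
        A_ub = np.hstack([-H, -np.eye(m), np.ones((m, 1))])
        A_eq = np.concatenate([np.ones(J), np.zeros(m), [0.0]]).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (J + m) + [(None, None)],
                      method="highs")
        u = -res.ineqlin.marginals            # duals of the margin rows
        beta = -res.eqlin.marginals[0]        # dual of the convexity row
        a = res.x[:J]
    return hypotheses, a

The final classifier predicts sign(sum_j a[j] * h_j(x)); because the master is a 1-norm, simplex-constrained LP, many entries of a are driven to exactly zero, which is the sparsity effect the surrounding papers discuss.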
OPTIMIZATION FOR SPARSE AND ACCURATE CLASSIFIERS
This dissertation proposes a branch-and-bound algorithm for finding a Boolean monomial that best agrees with a given set of data, a problem known as maximum monomial agreement.
Soft Margins for AdaBoost
It is found that AdaBoost asymptotically achieves a hard margin distribution, i.e., the algorithm concentrates its resources on a few hard-to-learn patterns that are, interestingly, very similar to support vectors.
An Improved Branch-and-Bound Method for Maximum Monomial Agreement
An exact branch-and-bound method for maximum agreement over Boolean monomials is described, improving on the earlier work of Goldberg and Shan by developing a tighter upper-bounding function and an improved branching procedure that exploits knowledge of the bound and the particular data set while having a lower branching factor.
A decision-theoretic generalization of on-line learning and an application to boosting
The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases but applicable to a considerably more general class of learning problems.
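
For reference, the multiplicative weight-update rule from this paper (the Hedge algorithm) scales each expert's weight by an exponential of its loss,

\[
w_{t+1,i} = w_{t,i}\, \beta^{\ell_{t,i}}, \qquad \beta \in (0,1),
\]

where \ell_{t,i} \in [0,1] is the loss of expert i in round t and predictions use the normalized weights; the paper obtains AdaBoost by adapting this scheme to boosting, with one weight maintained per training example.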
Boosting Optimal Logical Patterns Using Noisy Data
This work introduces a new branch-and-bound weak learner to maximize the agreement rate of abstaining monomial hypotheses, and uses smooth boosting to linearly combine abstaining hypotheses, each of which maps a subcube of the attribute space to one of the two classes.
Structural Risk Minimization Over Data-Dependent Hierarchies
A result is presented that allows one to trade off errors on the training sample against improved generalization performance, along with a more general result in terms of "luckiness" functions, which provides a quite general way of exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.
Classification on proximity data with LP-machines
We provide a new linear program to deal with classification of data given in terms of pairwise proximities. This allows us to avoid the problems inherent in using feature spaces with …
A Compression Approach to Support Vector Model Selection
Inspired by several generalization bounds, "compression coefficients" for SVMs are constructed which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane, and which can fairly accurately predict the parameters for which the test error is minimized.