A training algorithm for optimal margin classifiers

@inproceedings{Boser1992ATA,
  title={A training algorithm for optimal margin classifiers},
  author={Bernhard E. Boser and Isabelle Guyon and Vladimir Naumovich Vapnik},
  booktitle={COLT '92},
  year={1992}
}
A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. …
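
The algorithm described above became the support vector machine. Below is a minimal sketch of the idea using scikit-learn's SVC, which solves the same dual quadratic program; it is not the authors' implementation, and the very large C value is an assumption used here to approximate the hard-margin, separable case.

```python
# Minimal sketch (not the authors' code): approximating the optimal-margin
# classifier with scikit-learn's SVC, which solves the same dual QP.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two separable clusters stand in for the training patterns.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

clf = SVC(kernel="linear", C=1e6)   # very large C ~ hard margin (assumption)
clf.fit(X, y)

# The solution is a linear combination of the supporting patterns:
# w = sum_i alpha_i * y_i * x_i, summed over the support vectors only.
w = clf.dual_coef_ @ clf.support_vectors_
print("number of supporting patterns:", len(clf.support_))
print("w from supporting patterns:", w.ravel())
print("w stored by the solver:    ", clf.coef_.ravel())
```

For a linear kernel the two weight vectors printed above agree, illustrating that the decision function depends only on the patterns closest to the boundary.
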
Adaptive training methods for optimal margin classification
  • M. Lehtokangas
  • Computer Science
  • IJCNN'99 International Joint Conference on Neural Networks, Proceedings
  • 1999
TLDR
This study considers adaptive training schemes for optimal margin classification with neural networks, describes some novel schemes, and compares them with conventional ones.
Automatic Capacity Tuning of Very Large VC-Dimension Classifiers
TLDR
It is shown that even high-order polynomial classifiers in high dimensional spaces can be trained with a small amount of training data and yet generalize better than classifiers with a smaller VC-dimension.
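
An illustrative sketch of that claim (my example, not from the cited paper): a degree-4 polynomial kernel has an enormous nominal VC dimension over 64 input features, yet a margin-based classifier trained on a deliberately small sample still generalizes. The dataset, degree, and sample size are arbitrary choices.

```python
# Illustration only: high-degree polynomial kernel, small training set.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=500, random_state=0)   # deliberately small training set

clf = SVC(kernel="poly", degree=4, coef0=1, gamma="scale")
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("support vectors per class:", clf.n_support_)
```
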
Pattern Selection for Support Vector Classifiers
TLDR
A k-nearest neighbors (k-NN) based pattern selection method that tries to select the patterns that are near the decision boundary and correctly labeled, in order to cut training time by discarding redundant patterns.
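
A rough sketch of the selection idea as summarized above, not the paper's exact procedure (the same idea recurs in the "Fast Pattern Selection for Support Vector Classifiers" entry below). The neighbourhood size k and the majority-vote test are my simplifications.

```python
# Keep points whose neighbourhood contains another class (near the boundary)
# and whose own label agrees with the local majority (probably correctly labeled).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_boundary_patterns(X, y, k=10):
    """Indices of candidate boundary patterns. Assumes non-negative integer labels."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    neigh = y[idx[:, 1:]]                              # k neighbours, point itself excluded
    near_boundary = (neigh != y[:, None]).any(axis=1)  # some neighbours from another class
    majority = np.array([np.bincount(row).argmax() for row in neigh])
    correctly_labeled = majority == y                  # own label matches local majority
    return np.where(near_boundary & correctly_labeled)[0]

# Usage: keep = select_boundary_patterns(X_train, y_train); svm.fit(X_train[keep], y_train[keep])
```
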
Pattern recognition with novel support vector machine learning method
  • M. Lehtokangas
  • Computer Science, Mathematics
  • 2000 10th European Signal Processing Conference
  • 2000
TLDR
This study investigates the basic SVM method, points out some problems that may arise especially in large-scale problems with abundant data, and proposes a novel SVM-type method that aims to avoid these problems.
New support vector algorithms with parametric insensitive/margin model
  • Pei-Yi Hao
  • Mathematics, Medicine
  • Neural Networks
  • 2010
In this paper, a modification of ν-support vector machines (ν-SVM) for regression and classification is described, and the use of a parametric insensitive/margin model with an arbitrary shape is…
Training Data Selection for Support Vector Machines
TLDR
This paper proposes two new methods that select a subset of data for SVM training and shows that a significant amount of training data can be removed by the proposed methods without degrading the performance of the resulting SVM classifiers.
Fast Pattern Selection for Support Vector Classifiers
TLDR
A k-nearest neighbors (k-NN) based pattern selection method that tries to select the patterns that are near the decision boundary and correctly labeled, in order to cut training time by discarding redundant patterns.
Perceptron-like large margin classifiers
TLDR
As the data are embedded in the augmented space at a larger distance from the origin, the maximum margin in that space approaches the maximum geometric margin in the original space, so the algorithmic procedure can be regarded as an approximate maximal-margin classifier.
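
A toy illustration of the embedding described above (my construction, not the authors' algorithm): appending a constant coordinate R lets a hyperplane through the origin in the augmented space encode a biased separator in the original space, and as R grows its geometric margin approaches that of the ordinary maximal-margin separator. The dataset, solver, and values of R are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC, SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=1)

# Reference: geometric margin of the ordinary (biased) near-hard-margin separator.
ref = SVC(kernel="linear", C=1e6).fit(X, y)
ref_margin = 1.0 / np.linalg.norm(ref.coef_)

for R in (1.0, 10.0, 100.0):
    X_aug = np.hstack([X, np.full((len(X), 1), R)])             # embed at distance R
    clf = LinearSVC(fit_intercept=False, C=1e6, max_iter=100000).fit(X_aug, y)
    w, b = clf.coef_[0, :-1], clf.coef_[0, -1] * R              # recover (w, b)
    margin = np.min(np.abs(X @ w + b) / np.linalg.norm(w))      # geometric margin
    print(f"R={R:>6}: margin={margin:.3f} (reference {ref_margin:.3f})")
```
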
Selecting Data for Fast Support Vector Machines Training
TLDR
This paper proposes two new methods that select a subset of data for SVM training and shows that a significant amount of training data can be removed by the proposed methods without degrading the performance of the resulting SVM classifiers.
On the proliferation of support vectors in high dimensions
TLDR
This paper identifies new deterministic equivalences for the phenomenon of support vector proliferation, uses them to substantially broaden the conditions under which the phenomenon occurs in high-dimensional settings, and proves a nearly matching converse result.

References

SHOWING 1-10 OF 41 REFERENCES
Structural Risk Minimization for Character Recognition
TLDR
The method of Structural Risk Minimization is used to control the capacity of linear classifiers and improve generalization on the problem of handwritten digit recognition.
Computer aided cleaning of large databases for character recognition
TLDR
By using the method of pattern cleaning, combined with an emphasizing scheme applied to the patterns that are hard to learn, the error rate on the test set was reduced by half for the database of handwritten lowercase characters entered on a touch terminal.
Comparing different neural network architectures for classifying handwritten digits
TLDR
The authors propose a novel way of organizing the network architectures by training several small networks so as to deal separately with subsets of the problem, and then combining the results.
Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks
TLDR
A theory is reported that shows the equivalence between regularization and a class of three-layer networks called regularization networks or hyper basis functions.
Consistent inference of probabilities in layered networks: predictions and generalizations
The problem of learning a general input-output relation using a layered neural network is discussed in a statistical framework. By imposing the consistency condition that the error minimization be…
Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network
TLDR
A scheme is implemented that allows a network to learn the derivative of its outputs with respect to distortion operators of the user's choosing, which not only reduces the learning time and the amount of training data, but also provides a powerful language for specifying which generalizations the network should perform.
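
A minimal sketch of the tangent-propagation penalty as summarized above (my reconstruction, not the paper's code), using PyTorch's jvp to take the directional derivative of the outputs along the tangent vector of a chosen distortion; the distortion, tangent, and weighting are assumptions.

```python
# Penalise the directional derivative of the network output along the tangent
# vector of a chosen distortion, so the output becomes locally invariant to it.
import torch
from torch.autograd.functional import jvp

def tangent_prop_penalty(model, x, tangent):
    """Squared norm of J_model(x) @ tangent: sensitivity of the outputs to the distortion."""
    _, directional_deriv = jvp(model, x, tangent, create_graph=True)
    return directional_deriv.pow(2).sum()

# Usage (assumed): `tangent` approximates d(distort(x, eps))/d(eps) at eps = 0,
# e.g. (rotate(x, eps) - x) / eps for a small rotation of the input images.
# total_loss = task_loss + lambda_tp * tangent_prop_penalty(model, x, tangent)
```
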
What Size Net Gives Valid Generalization?
TLDR
It is shown that if m ≥ O((W/ε) log(N/ε)) random examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction 1 − ε/2 of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction 1 − ε of future test examples drawn from the same distribution.
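
Restating the bound summarized above in display form (my transcription of the result, not a verbatim quote):

```latex
% With this many random examples, fitting a fraction 1-\epsilon/2 of them correctly
% gives, with confidence approaching certainty, test accuracy at least 1-\epsilon
% for a threshold network with N nodes and W weights.
m \;\ge\; O\!\left(\frac{W}{\epsilon}\,\log\frac{N}{\epsilon}\right)
```
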
Predicting {0,1}-functions on randomly drawn points
TLDR
This model is related to Valiant's PAC learning model, but does not require the hypotheses used for prediction to be represented in any specified form, and shows how to construct prediction strategies that are optimal to within a constant factor for any reasonable class F of target functions.
Neural Networks and the Bias/Variance Dilemma
TLDR
It is suggested that current-generation feedforward neural networks are largely inadequate for difficult problems in machine perception and machine learning, regardless of parallel-versus-serial hardware or other implementation issues.
Fast Learning in Networks of Locally-Tuned Processing Units
We propose a network architecture which uses a single internal layer of locally-tuned processing units to learn both classification tasks and real-valued function approximations (Moody and Darken…
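
A compact sketch of such a locally-tuned network (my reconstruction, not the authors' implementation): Gaussian units centred by k-means, followed by a linear output layer fit by regularized least squares. The unit count, width, and ridge strength are arbitrary choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

class RBFNet:
    """Single layer of locally-tuned Gaussian units plus a linear output layer."""
    def __init__(self, n_units=20, width=1.0):
        self.n_units, self.width = n_units, width

    def _phi(self, X):
        d2 = ((X[:, None, :] - self.centers_[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * self.width ** 2))     # Gaussian activations

    def fit(self, X, y):
        self.centers_ = KMeans(self.n_units, n_init=10, random_state=0).fit(X).cluster_centers_
        self.out_ = Ridge(alpha=1e-6).fit(self._phi(X), y)   # linear output layer
        return self

    def predict(self, X):
        return self.out_.predict(self._phi(X))
```
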