## 5,915 Citations

### Combining Generalizers Using Partitions of the Learning Set

- Computer Science
- 1993

For any real-world generalization problem, there are always many generalizers which could be applied to the problem, so this chapter discusses some algorithmic techniques for dealing with this multiplicity of possible generalizers, including an extension of cross-validation called stacked generalization.
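The core mechanics of stacked generalization described above can be sketched in a few lines: partition the learning set, train the level-0 generalizers on each partition's complement, and use their held-out predictions as the training set for a level-1 generalizer. This is a minimal illustrative sketch, not the paper's exact procedure; the two toy base learners and all function names are hypothetical.

```python
# Minimal sketch of the stacked-generalization partitioning scheme.
# The two level-0 learners below are illustrative toys, not from the paper.

def mean_learner(xs, ys):
    """Level-0 learner A: predicts the training-set mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda x: m

def nn_learner(xs, ys):
    """Level-0 learner B: 1-nearest-neighbour prediction."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def level1_training_set(xs, ys, learners):
    """Build the level-1 training set from out-of-fold level-0 predictions.

    Each point's level-1 features are the predictions of models that never
    saw that point during training (here a leave-one-out partition, the
    simplest instance of the cross-validation-style partition)."""
    rows = []
    for i in range(len(xs)):
        tr_x = xs[:i] + xs[i + 1:]
        tr_y = ys[:i] + ys[i + 1:]
        feats = [fit(tr_x, tr_y)(xs[i]) for fit in learners]
        rows.append((feats, ys[i]))
    return rows

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.1, 1.9, 3.2, 3.9]
rows = level1_training_set(xs, ys, [mean_learner, nn_learner])
# rows[i] = ([prediction of A, prediction of B], true target) -- the set
# a level-1 generalizer would then be fit on.
```

The level-1 generalizer (omitted here) is then trained on `rows`, learning how to combine the base learners' guesses.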

### Stacked Generalizations: When Does It Work?

- Computer Science, IJCAI
- 1997

This paper addresses two crucial issues which have been considered to be a 'black art' in classification tasks ever since the introduction of stacked generalization by Wolpert in 1992: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input.

### Cascade Generalization

- Computer Science, Machine Learning
- 2004

Two related methods for merging classifiers are presented, one of which outperforms other methods for combining classifiers, like Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.

### On Deriving the Second-Stage Training Set for Trainable Combiners

- Computer Science, Multiple Classifier Systems
- 2005

An extension of the stacked generalization approach is proposed which significantly improves combiner robustness but introduces additional noise into the second-stage training set, and should therefore be paired with simple combiners that are insensitive to noise.

### Local Cascade Generalization

- Computer Science
- 1998

Local cascade generalization for merging classifiers outperforms other methods for combining classifiers, such as Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.

### An Efficient Method To Estimate Bagging's Generalization Error

- Computer Science, Machine Learning
- 2004

This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation.
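One way to estimate bagging's generalization error without further training, in the spirit described above, is an out-of-bag-style estimate: each bootstrap replicate omits roughly a third of the points, so every point can be evaluated using only the bagged models that never trained on it. This sketch is illustrative and makes that assumption; it is not necessarily the paper's exact estimator, and all names are hypothetical.

```python
import random

def oob_error_sketch(xs, ys, fit, n_bags=25, seed=0):
    """Out-of-bag-style squared-error estimate for a bagged regressor.

    Reuses the bagged models themselves: each point is predicted only by
    the replicates whose bootstrap sample excluded it, so no training
    beyond the bagging itself is needed."""
    rng = random.Random(seed)
    n = len(xs)
    models, in_bag = [], []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        models.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))
        in_bag.append(set(idx))
    errs = []
    for i in range(n):
        preds = [m(xs[i]) for m, bag in zip(models, in_bag) if i not in bag]
        if preds:  # point was out-of-bag for at least one replicate
            avg = sum(preds) / len(preds)
            errs.append((avg - ys[i]) ** 2)
    return sum(errs) / len(errs)

# Toy base learner: always predicts the mean of its training targets.
mean_fit = lambda tx, ty: (lambda x, m=sum(ty) / len(ty): m)
err = oob_error_sketch(list(range(10)), [float(i) for i in range(10)], mean_fit)
```

Because the estimate reuses the already-trained bag members, its cost is a handful of extra predictions rather than the extra training runs a cross-validation-based estimate would require.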

### Cascade Generalization

- Computer Science
- 2000

Cascade also outperforms other methods for combining classifiers, like Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.

### Stacked generalization in neural networks: generalization on statistically neutral problems

- Computer Science, IJCNN'01 International Joint Conference on Neural Networks Proceedings (Cat. No.01CH37222)
- 2001

It is shown that for statistically neutral problems such as parity and majority function, the stacked generalization scheme improves classification performance and generalization accuracy over the single level cross-validation model.

### Combining the Predictions of Multiple Classifiers: Using Competitive Learning to Initialize Neural Networks

- Computer Science, IJCAI
- 1995

An approach to initializing neural networks that uses competitive learning to intelligently create networks that are originally located far from the origin of weight space, thereby potentially increasing the set of reachable local minima.

### Linear classifier combination and selection using group sparse regularization and hinge loss

- Computer Science, Pattern Recognit. Lett.
- 2013

## References

Showing 1-10 of 25 references

### A Mathematical Theory of Generalization: Part II

- Mathematics, Complex Syst.
- 1990

This leads to the conclusion that (current) neural nets in fact constitute a poor means of generalizing, and other sets of criteria, more sophisticated than those embodied in this first series of papers, are investigated.

### The Relationship Between Occam's Razor and Convergent Guessing

- Computer Science, Complex Syst.
- 1990

This paper establishes the relationship between Occam's razor and convergent guessing, deduces an optimal measure of the "simplicity" of an architecture's guessing behavior, and goes on to elucidate the many advantages, both practical and theoretical, of using the optimal simplicity measure.

### A theory of the learnable

- Computer Science, STOC '84
- 1984

This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.

### Constructing a generalizer superior to NETtalk via a mathematical theory of generalization

- Mathematics, Neural Networks
- 1990

### How Neural Nets Work

- Computer Science, NIPS
- 1987

This paper demonstrates that for certain applications neural networks can achieve significantly higher numerical accuracy than more conventional techniques, and shows that prediction of future values of a chaotic time series can be performed with exceptionally high accuracy.

### Computers and the Theory of Statistics: Thinking the Unthinkable

- Mathematics
- 1979

This is a survey article concerning recent advances in certain areas of statistical theory, written for a mathematical audience with no background in statistics. The topics are chosen to illustrate a…

### MIT Progress in Understanding Images

- Computer Science
- 1982

Work on stereo to facilitate the computation of depth information and visible surface characteristics, the detection and interpretation of motion, the interpolation and description of visible surfaces, the description of two- and three-dimensional shapes, real-time convolution, and shape from shading is reviewed.

### Hierarchical training of neural networks and prediction of chaotic time series

- Computer Science
- 1991

### Methods for Solving Incorrectly Posed Problems

- Education
- 1984

### NETtalk: a parallel network that learns to read aloud

- Computer Science
- 1988

NETtalk is an alternative approach that is based on an automated learning procedure for a parallel network of deterministic processing units that achieves good performance and generalizes to novel words.