We present an efficient feature selection algorithm for the general regression problem, which utilizes a piecewise linear orthonormal least squares (OLS) procedure. The algorithm 1) determines an appropriate piecewise linear network (PLN) model for the given data set, 2) applies the OLS procedure to the PLN model, and 3) searches for useful feature subsets…
Piecewise linear networks (PLNs) are attractive because they can be trained quickly and provide good performance in many nonlinear approximation problems. Most existing design algorithms for piecewise linear networks are not convergent, are non-optimal, or are not designed to handle noisy data. In this paper, four algorithms are presented which attack this…
Starting from the strict interpolation equations for multivariate polynomials, an upper bound is developed for the number of patterns that can be memorized by a nonlinear feedforward network. A straightforward proof by contradiction is presented for the upper bound. It is shown that the hidden activations do not have to be analytic. Networks, trained by…
In this paper, three approaches are presented for generating and validating sequences of neural nets of different sizes. First, a growing method is given, along with several weight initialization methods and their properties. Then a one-pass pruning method is presented which utilizes orthogonal least squares. Based upon this pruning approach, a one-pass…
In this paper, we model large support vector machines (SVMs) by smaller networks in order to decrease the computational cost. The key idea is to generate additional training patterns using a trained SVM and use these additional patterns along with the original training patterns to train a neural network. Results verify the validity of the technique.
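The pattern-generation idea in this abstract can be illustrated with a small NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic two-class data, the regularized kernel expansion standing in for a trained SVM, the 20-unit Gaussian hidden layer standing in for the smaller network, and all names (`teacher`, `student`, `hidden`, `centers`).

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

# Original training patterns (synthetic): two classes separated by a circle.
X = rng.uniform(-2, 2, size=(200, 2))
y = np.where((X ** 2).sum(axis=1) > 2.5, 1.0, -1.0)

# Stand-in for a trained SVM (assumption): a kernel expansion over all 200
# training points, fitted by regularized least squares rather than by an
# actual SVM solver.
alpha = np.linalg.solve(rbf(X, X) + 1e-2 * np.eye(len(X)), y)

def teacher(P):
    return rbf(P, X) @ alpha

# Additional training patterns: new inputs labeled by the large model.
X_extra = rng.uniform(-2, 2, size=(1000, 2))
t_extra = teacher(X_extra)

# Smaller network (assumption): 20 Gaussian hidden units plus a bias, with
# linear output weights solved by least squares.
centers = X[rng.choice(len(X), size=20, replace=False)]

def hidden(P):
    return np.hstack([rbf(P, centers), np.ones((len(P), 1))])

# Train on the original patterns together with the generated ones.
X_all = np.vstack([X, X_extra])
t_all = np.concatenate([y, t_extra])
w, *_ = np.linalg.lstsq(hidden(X_all), t_all, rcond=None)

def student(P):
    return hidden(P) @ w

# Fraction of fresh test points on which both models pick the same class.
X_test = rng.uniform(-2, 2, size=(500, 2))
agreement = np.mean(np.sign(student(X_test)) == np.sign(teacher(X_test)))
print(f"hidden units: {len(centers)} vs. {len(X)} kernel terms, "
      f"agreement: {agreement:.2f}")
```

The small model evaluates 20 kernel terms per input instead of 200, which is the kind of cost reduction the abstract is after; how closely it tracks the large model depends on the number of hidden units and on how the additional patterns are sampled.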
In this paper, the effects of nonsingular affine transforms on various nonlinear network training algorithms are analyzed. It is shown that gradient-related methods are quite sensitive to an input affine transform, while Newton-related methods are invariant. These results give a connection between pre-processing techniques and weight initialization…
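The sensitivity claim can be checked numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the paper's analysis: it uses a linear model with squared error instead of a nonlinear network, a hand-picked nonsingular matrix `A` and offset `c`, and a damped Newton step so that the result is not just the one-step least-squares solution. Two models start out equivalent (identical outputs on every pattern), one trained on raw inputs and one on affinely transformed inputs; each takes one training step.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 200, 3

# Augmented inputs [x, 1] and arbitrary targets.
X = np.hstack([rng.normal(size=(N, n)), np.ones((N, 1))])
t = rng.normal(size=N)

# Nonsingular affine transform z = A x + c, written as one augmented matrix T.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 3.0]])   # upper triangular, determinant 6
c = np.array([1.0, -2.0, 0.5])
T = np.eye(n + 1)
T[:n, :n] = A
T[:n, n] = c
Z = X @ T.T                        # transformed (augmented) inputs

# Equivalent starting weights: Z @ w_z equals X @ w_x.
w_x = rng.normal(size=n + 1)
w_z = np.linalg.solve(T.T, w_x)

def grad(D, w):
    """Gradient of the mean squared error with respect to w."""
    return -2.0 / len(D) * D.T @ (t - D @ w)

def hess(D):
    """Hessian of the mean squared error (constant for a linear model)."""
    return 2.0 / len(D) * D.T @ D

def gd_step(D, w, lr=0.1):
    return w - lr * grad(D, w)

def newton_step(D, w, damping=0.5):
    return w - damping * np.linalg.solve(hess(D), grad(D, w))

# Largest output difference between the two models after one step.
gd_gap = np.max(np.abs(X @ gd_step(X, w_x) - Z @ gd_step(Z, w_z)))
newton_gap = np.max(np.abs(X @ newton_step(X, w_x) - Z @ newton_step(Z, w_z)))
print(f"gradient step gap: {gd_gap:.3e}, Newton step gap: {newton_gap:.3e}")
```

After a gradient step the two models no longer agree, because the gradient in the transformed space maps back to the original space as `T.T @ T` times the original gradient. After a Newton step the outputs agree to rounding error, since the Hessian transforms in a way that cancels `T` exactly.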
In this paper, we propose an efficient method for forecasting highly redundant time series based on historical information. First, redundant inputs and desired outputs are compressed and used to train a single network. Second, network output vectors are uncompressed. Our approach is successfully tested on the hourly temperature forecasting problem.
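The compress, train, uncompress pipeline can be sketched as follows. The abstract does not say which compression or network is used, so this sketch assumes a PCA-style projection computed with an SVD and a linear least-squares map in place of the network; the synthetic temperature series and every name (`fit_pca`, `compress`, `uncompress`, `k`) are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly temperatures (assumption): seasonal trend + daily cycle + noise.
days, hours = 365, np.arange(24)
d = np.arange(days)[:, None]
temps = (15 + 8 * np.sin(2 * np.pi * d / 365)
         + 5 * np.sin(2 * np.pi * (hours - 9) / 24)
         + rng.normal(scale=1.0, size=(days, 24)))

# Inputs: one day's 24 readings. Desired outputs: the next day's 24 readings.
X, Y = temps[:-1], temps[1:]
n_train = 300
X_tr, Y_tr, X_te, Y_te = X[:n_train], Y[:n_train], X[n_train:], Y[n_train:]

def fit_pca(D, k):
    """Return the mean and the top-k principal directions of the rows of D."""
    mean = D.mean(axis=0)
    _, _, Vt = np.linalg.svd(D - mean, full_matrices=False)
    return mean, Vt[:k]

def compress(D, mean, basis):
    return (D - mean) @ basis.T

def uncompress(C, mean, basis):
    return C @ basis + mean

k = 3  # compressed dimension (assumption)
mx, Bx = fit_pca(X_tr, k)
my, By = fit_pca(Y_tr, k)

# Step 1: compress inputs and desired outputs, then train a single model
# (a linear map with bias, standing in for the network).
Cx = np.hstack([compress(X_tr, mx, Bx), np.ones((n_train, 1))])
Cy = compress(Y_tr, my, By)
W, *_ = np.linalg.lstsq(Cx, Cy, rcond=None)

# Step 2: predict in the compressed space and uncompress the output vectors.
Cx_te = np.hstack([compress(X_te, mx, Bx), np.ones((len(X_te), 1))])
Y_hat = uncompress(Cx_te @ W, my, By)

rmse = np.sqrt(np.mean((Y_hat - Y_te) ** 2))
print(f"forecast shape: {Y_hat.shape}, held-out RMSE: {rmse:.2f}")
```

The model maps 3 numbers to 3 numbers instead of 24 to 24, which is where the efficiency comes from when the hourly readings within a day are highly redundant.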