The paper introduces some generalizations of Vapnik's method of structural risk minimisation (SRM). As well as making explicit some of the details on SRM, it provides a result that allows one to trade off errors on the training sample against improved generalization performance. It then considers the more general case when the hierarchy of classes is…
The paper introduces a framework for studying structural risk minimisation. The model views structural risk minimisation in a PAC context. It then considers the more general case when the hierarchy of classes is chosen in response to the data. This theoretically explains the impressive performance of the maximal margin hyperplane algorithm of Vapnik. It may…
A new proof of a result due to Vapnik is given. Its implications for the theory of PAC learnability are discussed, with particular reference to the learnability of functions taking values in a countable set. An application to the theory of artificial neural networks is then given.
In this paper we consider the generalization accuracy of classification methods based on the iterative use of linear classifiers. The resulting classifiers, which we call threshold decision lists, act as follows. Some points of the data set to be classified are given a particular classification according to a linear threshold function (or hyperplane). These…
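The abstract above is truncated, so the following Python sketch relies on an assumption: it follows the usual decision-list convention, in which linear threshold tests are tried in order, the first test that fires assigns its label, and a default label is used if none fires. The names `Rule`, `classify`, and `default`, and the choice that points on a hyperplane count as satisfying the test, are illustrative and not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A rule is (weights, threshold, label). This representation is hypothetical.
Rule = Tuple[Sequence[float], float, int]


def classify(rules: List[Rule], default: int, x: Sequence[float]) -> int:
    """Return the label of the first rule whose linear test x satisfies.

    Assumed convention: a rule fires when w . x >= threshold.
    If no rule fires, the default label is returned.
    """
    for weights, threshold, label in rules:
        if sum(w * xi for w, xi in zip(weights, x)) >= threshold:
            return label
    return default


# Two illustrative rules on the plane:
#   1. if x1 >= 2, label 1
#   2. otherwise, if x2 >= 1, label 0
rules: List[Rule] = [([1.0, 0.0], 2.0, 1), ([0.0, 1.0], 1.0, 0)]
```

With these rules and a default label of 1, the point (3, 0) is labelled by the first rule, (0, 2) by the second, and (0, 0) falls through to the default.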
Linear threshold functions (for real and Boolean inputs) have received much attention, for they are the component parts of many artificial neural networks. Linear threshold functions are exactly those functions such that the positive and negative examples are separated by a hyperplane. One extension of this notion is to allow separators to be surfaces whose…
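As a minimal illustration of the hyperplane definition above, the Python sketch below evaluates a linear threshold function. The names `linear_threshold`, `weights`, and `threshold` are illustrative assumptions rather than notation from the paper, as is the convention that points lying on the hyperplane are classified as positive.

```python
from typing import Sequence


def linear_threshold(weights: Sequence[float], threshold: float,
                     x: Sequence[float]) -> int:
    """Return 1 if w . x >= threshold and 0 otherwise.

    The boundary convention (>= rather than >) is an assumption.
    """
    if len(weights) != len(x):
        raise ValueError("weights and x must have the same length")
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= threshold else 0


# Boolean AND on two inputs is a linear threshold function:
# it outputs 1 exactly when x1 + x2 >= 2.
and_outputs = [linear_threshold([1, 1], 2, (a, b))
               for a in (0, 1) for b in (0, 1)]
```

Here the hyperplane x1 + x2 = 2 separates the single positive example (1, 1) from the three negative examples of AND.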
In this paper, we study a statistical property of classes of real-valued functions that we call approximation from interpolated examples. We derive a characterization of function classes that have this property, in terms of their 'fat-shattering function', a notion that has proven useful in computational learning theory. We discuss the implications for…
A proof is given that a concept is learnable provided the Vapnik-Chervonenkis dimension is finite. The proof is more explicit than previous proofs and introduces two new parameters which allow bounds on the sample size obtained to be improved by a factor of approximately 4 log_2(e).
Some recent work [7, 14, 15] in computational learning theory has discussed learning in situations where the teacher is helpful, and can choose to present carefully chosen sequences of labelled examples to the learner. We say a function t in a set H of functions (a hypothesis space) defined on a set X is…