Martin Anthony

The paper introduces some generalizations of Vapnik's method of structural risk minimisation (SRM). As well as making explicit some of the details of SRM, it provides a result that allows one to trade off errors on the training sample against improved generalization performance. It then considers the more general case when the hierarchy of classes is chosen in response to the data. …
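The truncated abstract does not reproduce the bound itself; as a point of reference, an SRM-style trade-off bound typically has the following shape (a sketch with an unspecified absolute constant c and a standard PAC-style penalty, not the paper's exact statement):

```latex
% Generic shape of an SRM-style trade-off bound (illustrative only).
% For a nested hierarchy H_1 \subseteq H_2 \subseteq \cdots with
% d_i = VCdim(H_i), a hypothesis h \in H_i with sample error
% \widehat{er}(h) on m examples satisfies, with probability >= 1 - \delta_i,
\mathrm{er}(h) \;\le\; \widehat{\mathrm{er}}(h)
  \;+\; \sqrt{\frac{c}{m}\left(d_i \ln\frac{2em}{d_i}
  \;+\; \ln\frac{4}{\delta_i}\right)} .
```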
We say a function t in a set H of {0, 1}-valued functions defined on a set X is specified by S ⊆ X if the only function in H which agrees with t on S is t itself. The specification number of t is the least cardinality of such an S. For a general finite class of functions, we show that the specification number of any function in the class is at least equal to …
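To make the definition concrete, here is a brute-force sketch of the specification number for a small finite class; the threshold class H and the set X below are made-up examples, not taken from the paper:

```python
from itertools import combinations

def specification_number(t, H, X):
    # Least |S|, S subset of X, such that t is the only h in H
    # agreeing with t on S (the definition from the abstract).
    for k in range(len(X) + 1):
        for S in combinations(X, k):
            agreeing = [h for h in H if all(h(x) == t(x) for x in S)]
            if agreeing == [t]:  # t is specified by S
                return k
    return None  # some other h in H agrees with t even on all of X

# Example: 1-D threshold functions f_a(x) = 1 iff x >= a, on X = {0,1,2,3}.
X = list(range(4))
H = [(lambda x, a=a: int(x >= a)) for a in range(5)]
t = H[2]  # the threshold-at-2 function
print(specification_number(t, H, X))  # 2: the points 1 and 2 pin t down
```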
The Vapnik-Chervonenkis dimension has proven to be of great use in the theoretical study of generalization in artificial neural networks. The "probably approximately correct" learning framework is described and the importance of the Vapnik-Chervonenkis dimension is illustrated. We then investigate the Vapnik-Chervonenkis dimension of …
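The Vapnik-Chervonenkis dimension itself can be checked by exhaustive search on small finite classes. The sketch below uses 1-D threshold functions as an illustrative stand-in for the network classes studied in the paper:

```python
from itertools import combinations

def shatters(H, S):
    # S is shattered if every one of the 2^|S| labellings is realised by H.
    realised = {tuple(h(x) for x in S) for h in H}
    return len(realised) == 2 ** len(S)

def vc_dimension(H, X):
    # Largest k such that some k-subset of X is shattered by H.
    d = 0
    for k in range(1, len(X) + 1):
        if any(shatters(H, S) for S in combinations(X, k)):
            d = k
        else:
            break  # subsets of shattered sets are shattered, so stop here
    return d

X = list(range(6))
H = [(lambda x, a=a: int(x >= a)) for a in range(7)]
print(vc_dimension(H, X))  # 1: no pair x1 < x2 admits the labelling (1, 0)
```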
In this paper, we study a statistical property of classes of real-valued functions that we call approximation from interpolated examples. We derive a characterisation of function classes that have this property, in terms of their ‘fat-shattering function’, a notion that has proven useful in computational learning theory. The property is central to a problem …
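For reference, the 'fat-shattering function' mentioned here is usually defined as follows (the standard formulation; the paper's conventions may differ in inessential details):

```latex
% F \gamma-shatters \{x_1, \dots, x_d\} \subseteq X if there exist
% witnesses r_1, \dots, r_d \in \mathbb{R} such that for every
% b \in \{0,1\}^d some f \in F satisfies
f(x_i) \ge r_i + \gamma \ \text{if } b_i = 1,
\qquad
f(x_i) \le r_i - \gamma \ \text{if } b_i = 0 .
% Then fat_F(\gamma) is the largest d for which some d-point set
% is \gamma-shattered.
```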
The paper introduces a framework for studying structural risk minimisation. The model views structural risk minimisation in a PAC context. It then considers the more general case when the hierarchy of classes is chosen in response to the data. This theoretically explains the impressive performance of the maximal margin hyperplane algorithm of Vapnik. It may …
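The truncated abstract omits the bound; a typical margin-based generalisation bound of the kind this line of work produces has the following shape (illustrative only, not the paper's exact result):

```latex
% A hyperplane achieving margin \gamma on m points inside a ball of
% radius R satisfies, with probability at least 1 - \delta,
\mathrm{er}(h) \;=\; O\!\left(\frac{1}{m}\left(\frac{R^2}{\gamma^2}\,
  \log^2 m \;+\; \log\frac{1}{\delta}\right)\right).
% Note the bound depends on the margin, not on the input dimension,
% which is what makes the data-dependent hierarchy useful.
```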
Some recent work [7, 14, 15] in computational learning theory has discussed learning in situations where the teacher is helpful, and can present carefully chosen sequences of labelled examples to the learner. We say a function t in a set H of functions (a hypothesis space) defined on a set X is …
Linear threshold functions (for real and Boolean inputs) have received much attention, for they are the component parts of many artificial neural networks. Linear threshold functions are exactly those functions such that the positive and negative examples are separated by a hyperplane. One extension of this notion is to allow separators to be surfaces whose …
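The extension from hyperplanes to surface separators can be seen concretely by treating a polynomial surface as a linear threshold over monomial features. The degree-2 map and the XOR-style data below are illustrative choices, not the paper's construction:

```python
import numpy as np

def linear_threshold(w, b):
    # f(x) = 1 iff <w, x> + b >= 0: a hyperplane separator.
    return lambda x: int(np.dot(w, x) + b >= 0)

def quadratic_features(x):
    # Degree-<=2 monomials of (x1, x2); a hyperplane over these features
    # is a quadratic surface in the original plane.
    x1, x2 = x
    return np.array([x1, x2, x1 * x1, x1 * x2, x2 * x2])

# XOR-style data: no hyperplane in the plane separates these labels...
pts = [np.array(p, dtype=float) for p in [(0, 0), (1, 1), (0, 1), (1, 0)]]
labels = [1, 1, 0, 0]

# ...but the quadratic surface 4*x1*x2 - 2*x1 - 2*x2 + 1 = 0 does.
w = np.array([-2.0, -2.0, 0.0, 4.0, 0.0])  # weights on the monomials
f = lambda x: linear_threshold(w, 1.0)(quadratic_features(x))
print([f(p) for p in pts])  # [1, 1, 0, 0] -- matches the labels
```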