1 INTRODUCTION We investigate the tradeoff between labeled The classical problem of learning a classification rule and unlabeled sample complexities in learning can be stated as follows: patterns from classes " 1 " and a classification rule for a parametric two-class " 2 " (or " states of nature ") appear with probabilities problem. In the problem… (More)
This paper concerns learning binary-valued functions defined on IR, and investigates how a particular type of 'regularity' of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion similar to that of sample margin for real-valued functions). This motivates learning algorithms… (More)
1 Introduction One of the main problems in machine learning and statistical inference is selecting an appropriate model by which a set of data can be explained. In the absense of any structured prior information aa to the data generating mechanism, one is often forced to consider a range of models, attempting to select the model which best explains the… (More)
Instead of static entropy we assert that the Kolmogorov complexity of a static structure such as a solid is the proper measure of disorder (or chaoticity). A static structure in a surrounding perfectly-random universe acts as an interfering entity which introduces local disruption in randomness. This is modeled by a selection rule R which selects a… (More)
—The classical theory of pattern recognition assumes labeled examples appear according to unknown underlying class conditional probability distributions where the pattern classes are picked randomly in a passive manner according to their a priori probabilities. This paper presents experimental results for an incremental nearest-neighbor learning algorithm… (More)
In this paper we present a new type of binary classifier defined on the unit cube. This classifier combines some of the aspects of the standard methods that have been used in the logical analysis of data (LAD) and geometric classifiers, with a nearest-neighbor paradigm. We assess the predictive performance of the new classifier in learning from a sample,… (More)
Techniques for the logical analysis of binary data have successfully been applied to non-binary data which has been 'binarized' by means of cutpoints; see [8, 9]. In this paper, we analyse the predictive performance of such techniques and, in particular, we derive generalization error bounds that depend on how 'robust' the cutpoints are.
Shannon's theory of information stands on a probabilistic representation of events that convey information, e.g., sending messages over a communication channel. Kolmogorov argues that information is a more fundamental concept which exists also in problems with no underlying stochastic model, for instance, the information contained in an algorithm or in the… (More)