We investigate the tradeoff between labeled and unlabeled sample complexities in learning a classification rule for a parametric two-class problem. In the problem… 1 INTRODUCTION The classical problem of learning a classification rule can be stated as follows: patterns from classes "1" and "2" (or "states of nature") appear with probabilities…
1 Introduction One of the main problems in machine learning and statistical inference is selecting an appropriate model by which a set of data can be explained. In the absence of any structured prior information as to the data-generating mechanism, one is often forced to consider a range of models, attempting to select the model which best explains the…
This paper concerns learning binary-valued functions defined on ℝ, and investigates how a particular type of 'regularity' of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion similar to that of sample margin for real-valued functions). This motivates learning algorithms…
Instead of static entropy we assert that the Kolmogorov complexity of a static structure such as a solid is the proper measure of disorder (or chaoticity). A static structure in a surrounding perfectly-random universe acts as an interfering entity which introduces local disruption in randomness. This is modeled by a selection rule R which selects a…
The classical theory of pattern recognition assumes labeled examples appear according to unknown underlying class conditional probability distributions where the pattern classes are picked randomly in a passive manner according to their a priori probabilities. This paper presents experimental results for an incremental nearest-neighbor learning algorithm…
Using the saddle-point method, an estimate is computed for the number $w_{m,N}(n)$ of ordered m-partitions (compositions) of a positive integer n under the constraint that the size of every part is at most N. The approximation error rate is $O(n^{-1/5})$.
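For small inputs, the quantity being estimated can be computed exactly, which gives a reference point for any asymptotic approximation. The following is a minimal sketch, not the saddle-point method of the paper: it counts compositions by dynamic programming, assuming every part is a positive integer between 1 and N. The function name `count_bounded_compositions` and its parameter names are hypothetical.

```python
def count_bounded_compositions(n: int, m: int, max_part: int) -> int:
    """Count ordered m-tuples of integers in [1, max_part] that sum to n.

    Assumption (not stated in the abstract): every part is at least 1.
    """
    # ways[s] = number of ordered sequences of the parts placed so far summing to s
    ways = [0] * (n + 1)
    ways[0] = 1  # the empty sequence sums to 0
    for _ in range(m):
        nxt = [0] * (n + 1)
        for s in range(n + 1):
            if ways[s]:
                # append one more part, never exceeding the target n
                for part in range(1, min(max_part, n - s) + 1):
                    nxt[s + part] += ways[s]
        ways = nxt
    return ways[n]
```

When N is at least n the constraint is inactive and the count reduces to the binomial coefficient C(n-1, m-1), which is a convenient sanity check.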
Techniques for the logical analysis of binary data have successfully been applied to non-binary data which has been 'binarized' by means of cutpoints; see [8, 9]. In this paper, we analyse the predictive performance of such techniques and, in particular, we derive generalization error bounds that depend on how 'robust' the cutpoints are.
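As an illustration of binarization by cutpoints, the sketch below maps each numeric attribute to one bit per cutpoint. The convention "bit is 1 when the value is at or above the cutpoint" is an assumption, as are the names `binarize` and `min_gap`. The `min_gap` helper is only a hypothetical proxy for how far the data lie from the cutpoints; it is not the robustness measure defined in the paper.

```python
from typing import List, Sequence


def binarize(row: Sequence[float],
             cutpoints: Sequence[Sequence[float]]) -> List[int]:
    """Turn one numeric row into bits, one bit per (attribute, cutpoint) pair.

    Assumed convention: bit = 1 if value >= cutpoint, else 0.
    """
    bits: List[int] = []
    for value, cuts in zip(row, cutpoints):
        bits.extend(1 if value >= c else 0 for c in cuts)
    return bits


def min_gap(rows: Sequence[Sequence[float]],
            cutpoints: Sequence[Sequence[float]]) -> float:
    """Smallest distance from any attribute value to a cutpoint on that attribute.

    Hypothetical proxy only: a larger gap means small perturbations of the
    data leave the binarized rows unchanged.
    """
    return min(abs(row[j] - c)
               for row in rows
               for j, cuts in enumerate(cutpoints)
               for c in cuts)
```

For example, with cutpoints `[[1.0, 3.0], [5.0]]` the row `[2.5, 7.0]` becomes three bits, two for the first attribute and one for the second.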
Shannon's theory of information stands on a probabilistic representation of events that convey information, e.g., sending messages over a communication channel. Kolmogorov argues that information is a more fundamental concept which exists also in problems with no underlying stochastic model, for instance, the information contained in an algorithm or in the…
Kolmogorov introduced a combinatorial measure of the information I(x : y) about the unknown value of a variable y conveyed by an input variable x taking a given value x. The paper extends this definition of information to a more general setting where 'x = x' may provide a vaguer description of the possible value of y.
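For the basic case where x takes a single known value, Kolmogorov's combinatorial information can be written as the drop in log-cardinality of the set of values y may take: log2 of the number of possible y overall, minus log2 of the number of y still compatible with the observed x. The sketch below assumes the joint knowledge is given as a finite set of allowed (x, y) pairs. The function name `combinatorial_information` is hypothetical, and the generalization to vaguer descriptions discussed in the paper is not covered.

```python
import math
from typing import Hashable, Iterable, Tuple


def combinatorial_information(pairs: Iterable[Tuple[Hashable, Hashable]],
                              x: Hashable) -> float:
    """I(x : y) = log2 |Y| - log2 |Y_x| over a finite set of allowed (x, y) pairs.

    Y is the set of y values appearing in any pair; Y_x is the set of
    y values that appear together with the given x.
    """
    pairs = list(pairs)
    all_y = {y for _, y in pairs}
    y_given_x = {y for xi, y in pairs if xi == x}
    if not y_given_x:
        raise ValueError("x does not occur in any allowed pair")
    return math.log2(len(all_y)) - math.log2(len(y_given_x))
```

With allowed pairs {(1,a), (1,b), (2,a), (2,b), (2,c), (2,d), (3,c)}, observing x = 3 pins y down completely and conveys 2 bits, while observing x = 2 rules nothing out and conveys 0 bits.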