Rocco A. Servedio

We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, and Sellie, where a learner is given access to labeled examples drawn from a distribution, without restriction on the labels (e.g., adversarial noise). The algorithm constructs a hypothesis whose …
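The truncated abstract omits the guarantee, but the approach is low-degree polynomial regression followed by thresholding. A minimal sketch of that recipe, with ordinary least squares standing in for the paper's L1 polynomial regression (the function names and the default degree here are ours, not the paper's):

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree):
    """All monomials of the input coordinates up to the given degree."""
    n_samples, n = X.shape
    cols = [np.ones(n_samples)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n), d):
            cols.append(np.prod(X[:, idx], axis=1))
    return np.column_stack(cols)

def agnostic_halfspace_fit(X, y, degree=2):
    """Fit a degree-d polynomial to +/-1 labels, then predict with its sign.
    Least squares stands in here for the paper's L1 polynomial regression."""
    Phi = poly_features(X, degree)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return lambda Xnew: np.sign(poly_features(Xnew, degree) @ w)
```

Expanding to all degree-d monomials lets a linear fit express any degree-d polynomial; the paper's point is that under its distributional assumptions a modest degree already suffices.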
We show that for low-density parity-check (LDPC) codes whose Tanner graphs have sufficient expansion, the linear programming (LP) decoder of Feldman, Karger, and Wainwright can correct a constant fraction of errors. A random graph will have sufficient expansion with high probability, and recent work shows that such graphs can be constructed efficiently. …
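For a concrete sense of what the LP decoder optimizes, here is a toy sketch with scipy: it builds Feldman-style parity-polytope constraints (one inequality per odd-size subset of each check's neighborhood, so it only scales to low-degree checks) and corrects a single bit flip in a small Hamming code. The example code and helper name are ours; the paper's results concern codes on expander graphs, not this toy.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def lp_decode(H, y):
    """LP decoding for a binary code with parity-check matrix H and received
    word y (binary symmetric channel): minimize sum(gamma_i * x_i) over the
    relaxed codeword polytope."""
    m, n = H.shape
    gamma = 1.0 - 2.0 * y                  # cost +1 for a received-0 bit, -1 for a received-1 bit
    A_ub, b_ub = [], []
    for j in range(m):
        nbrs = np.flatnonzero(H[j])
        for size in range(1, len(nbrs) + 1, 2):   # odd-size subsets S of the check's neighborhood
            for S in combinations(nbrs, size):
                row = np.zeros(n)
                row[list(nbrs)] = -1.0
                row[list(S)] = 1.0                # sum_{i in S} x_i - sum_{i in N\S} x_i <= |S|-1
                A_ub.append(row)
                b_ub.append(size - 1)
    res = linprog(gamma, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, 1)] * n, method="highs")
    return np.round(res.x).astype(int)

# toy example: a (7,4) Hamming-style check matrix, one flipped bit
H = np.array([[1,1,0,1,1,0,0],
              [1,0,1,1,0,1,0],
              [0,1,1,1,0,0,1]])
received = np.array([0,0,1,0,0,0,0])       # all-zeros codeword with bit 2 flipped
print(lp_decode(H, received))              # recovers the all-zeros codeword
```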
A broad class of boosting algorithms can be interpreted as performing coordinate-wise gradient descent to minimize some potential function of the margins of a data set. This class includes AdaBoost, LogitBoost, and other widely used and well-studied boosters. In this paper we show that for a broad class of convex potential functions, any such boosting algorithm …
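AdaBoost is the canonical member of this class: its example weights are exactly the negative gradient of the exponential potential $\sum_i \exp(-y_i F(x_i))$, and its step size is a line search along the chosen coordinate. A minimal sketch of one round under that view (function names are ours):

```python
import numpy as np

def adaboost_round(F, X, y, weak_learners):
    """One AdaBoost step, viewed as coordinate-wise gradient descent on the
    exponential potential sum_i exp(-y_i * F(x_i))."""
    margins = y * F(X)                    # assumes F maps a sample matrix to scores
    w = np.exp(-margins)                  # negative gradient of the potential per margin
    w /= w.sum()
    # the coordinate (weak hypothesis) with the largest weighted edge
    h = max(weak_learners, key=lambda g: abs(np.sum(w * y * g(X))))
    eps = np.clip(np.sum(w * (h(X) != y)), 1e-12, 1 - 1e-12)
    alpha = 0.5 * np.log((1 - eps) / eps) # exact line search for this potential
    return lambda Xn, F=F, h=h, a=alpha: F(Xn) + a * h(Xn)

# usage: start from the zero hypothesis and iterate
# F = lambda X: np.zeros(len(X))
# for _ in range(T): F = adaboost_round(F, X, y, weak_learners)
```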
We describe a new boosting algorithm which generates only smooth distributions, which do not assign too much weight to any single example. We show that this new boosting algorithm can be used to construct efficient PAC learning algorithms which tolerate relatively high rates of malicious noise. In particular, we use the new smooth boosting algorithm to …
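One way to see what smoothness buys: capping every example's weight at 1/(eps*m) means an adversary who corrupts an eta fraction of the sample can attract at most an eta/eps share of the booster's attention. A sketch of such a capping step (this projection is our illustration of the smoothness constraint, not the paper's specific update rule; eps and the names are ours):

```python
import numpy as np

def smooth_distribution(raw_weights, eps):
    """Turn booster weights into a distribution that puts at most 1/(eps*m)
    mass on any single example, by iterated clipping and renormalization."""
    m = len(raw_weights)
    cap = 1.0 / (eps * m)
    w = np.asarray(raw_weights, dtype=float)
    w /= w.sum()
    capped = np.zeros(m, dtype=bool)
    for _ in range(m):                    # each pass caps at least one new entry
        over = (w > cap) & ~capped
        if not over.any():
            break
        capped |= over
        w[capped] = cap
        free = ~capped
        deficit = 1.0 - cap * capped.sum()
        if w[free].sum() > 0:
            w[free] *= deficit / w[free].sum()
        else:
            w[free] = deficit / free.sum()  # degenerate case: spread mass uniformly
    return w
```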
We give new upper and lower bounds on the degree of real multivariate polynomials which sign-represent Boolean functions. Our upper bounds for Boolean formulas yield the first known subexponential-time learning algorithms for formulas of superconstant depth. Our lower bounds for constant-depth circuits and intersections of halfspaces are the first …
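Sign-representation has a simple operational meaning: the polynomial must be nonzero and agree in sign with the function at every Boolean input. A small brute-force checker, plus the classic degree-1 example for AND (names ours):

```python
from itertools import product

def sign_represents(poly, f, n):
    """Check that a real polynomial sign-represents a Boolean function on
    {-1,1}^n: poly(x) must be nonzero and match the sign of f(x) everywhere."""
    for x in product((-1, 1), repeat=n):
        p = poly(x)
        if p == 0 or (p > 0) != (f(x) > 0):
            return False
    return True

# degree-1 example: AND of 3 bits (+1 = true) is sign-represented by x1+x2+x3-2
AND = lambda x: 1 if all(b == 1 for b in x) else -1
print(sign_represents(lambda x: sum(x) - 2, AND, 3))   # True
```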
We consider the problem of using a multicast network code to transmit information securely in the presence of a “wire-tap” adversary who can eavesdrop on a bounded number of network edges. Cai and Yeung (ISIT, 2002) gave a method to alter any given linear network code into a new code that is secure. However, their construction is in general inefficient, and …
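The flavor of these constructions is coset coding, which in its simplest one-edge-adversary case is a one-time pad split across two edges. A toy sketch (ours, not the paper's construction): each single edge carries a uniformly random bit that reveals nothing about the secret, while the sink, seeing both edges, recovers it.

```python
import secrets

def wiretap_encode(message_bit):
    """Split a secret bit across two edges so that an eavesdropper who sees
    any single edge learns nothing."""
    r = secrets.randbits(1)          # fresh random bit, never reused
    return r, message_bit ^ r        # symbols sent on edge 1 and edge 2

def wiretap_decode(edge1, edge2):
    return edge1 ^ edge2             # the sink observes both edges

s = 1
e1, e2 = wiretap_encode(s)
assert wiretap_decode(e1, e2) == s   # sink recovers the secret
# each of e1, e2 alone is a uniform bit, independent of s
```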
We consider a fundamental problem in computational learning theory: learning an arbitrary Boolean function that depends on an unknown set of k out of n Boolean variables. We give an algorithm for learning such functions from uniform random examples that runs in time roughly $(n^k)^{\omega/(\omega+1)}$, where $\omega < 2.376$ is the matrix multiplication exponent. We thus obtain the …
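For intuition about the easy regime of this problem: a relevant variable that is correlated with the label can be found just by estimating degree-1 Fourier coefficients; the hard part, which the paper's algorithm handles, is juntas such as parities where every single-variable correlation vanishes. A hedged sketch of the easy test only (names and threshold ours):

```python
import numpy as np

def correlated_variables(X, y, threshold=0.1):
    """Flag input variables noticeably correlated with the +/-1 label.
    This finds relevant variables only when the junta has nonzero degree-1
    Fourier coefficients; it misses functions like parities, which the
    paper's algorithm is designed to handle."""
    Xpm = 1 - 2 * X                    # map {0,1} inputs to {+1,-1}
    corr = np.abs(Xpm.T @ y) / len(y)  # empirical degree-1 Fourier coefficients
    return np.flatnonzero(corr > threshold)
```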
We propose and analyze a new vantage point for the learning of mixtures of Gaussians: namely, the PAC-style model of learning probability distributions introduced by Kearns et al. [12]. Here the task is to construct a hypothesis mixture of Gaussians that is statistically indistinguishable from the actual mixture generating the data; …
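For a concrete sense of the task (produce a hypothesis mixture, then measure how well it explains fresh data), here is a sketch using scikit-learn's standard EM fit. This is not the paper's algorithm, and held-out log-likelihood is only a rough stand-in for its statistical-indistinguishability guarantee:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# samples from a two-component target mixture
data = np.vstack([rng.normal(-3, 1, (500, 1)), rng.normal(3, 1, (500, 1))])

# hypothesis mixture fitted by EM
hyp = GaussianMixture(n_components=2, random_state=0).fit(data)

# average log-likelihood of held-out samples under the hypothesis
held_out = np.vstack([rng.normal(-3, 1, (100, 1)), rng.normal(3, 1, (100, 1))])
print(hyp.score(held_out))
```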
Using techniques from learning theory, we show that any s-term DNF over n variables can be computed by a polynomial threshold function of degree $O(n^{1/3} \log s)$. This upper bound matches, up to a logarithmic factor, the longstanding lower bound given by Minsky and Papert in their 1968 book Perceptrons. …
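For contrast with the paper's bound: the naive polynomial threshold function for a DNF just sums the 0/1 indicator polynomial of each term and thresholds at 1/2, so its degree is the longest term's length, which can be as large as n. The paper's construction achieves degree $O(n^{1/3} \log s)$ regardless of term length. A sketch of the naive baseline (encoding and names ours):

```python
def dnf_to_ptf(terms):
    """Naive PTF for a DNF over {0,1}^n: sum each term's 0/1 indicator
    polynomial and subtract 1/2; positive iff some term is satisfied."""
    def p(x):
        total = 0.0
        for term in terms:          # term: list of (index, wanted_value) literals
            ind = 1.0
            for i, v in term:       # product of x_i or (1 - x_i): a 0/1 indicator
                ind *= x[i] if v else (1 - x[i])
            total += ind
        return total - 0.5
    return p

# f = (x0 AND NOT x1) OR x2
p = dnf_to_ptf([[(0, 1), (1, 0)], [(2, 1)]])
print(p([1, 0, 0]) > 0, p([0, 1, 0]) > 0)   # True False
```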
We study online learning in Boolean domains using kernels which capture feature expansions equivalent to using conjunctions over basic features. We demonstrate a tradeoff between the computational efficiency with which these kernels can be computed and the generalization ability of the resulting classifier. We first describe several kernel functions which …
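The basic conjunction kernel has a closed form: over {0,1}^n, the number of conjunctions of literals satisfied by both examples is 2 raised to the number of coordinates on which they agree, so the kernel costs O(n) to evaluate even though the implicit feature space has one coordinate per conjunction. A hedged sketch pairing it with a standard kernel perceptron (the training loop and names are ours):

```python
import numpy as np

def conjunction_kernel(x, z):
    """K(x, z) = 2^(number of agreeing coordinates): counts the conjunctions
    of literals satisfied by both Boolean examples."""
    return 2.0 ** np.sum(x == z)

def kernel_perceptron(X, y, epochs=5):
    """Standard kernel perceptron over the implicit conjunction features."""
    m = len(y)
    alpha = np.zeros(m)
    for _ in range(epochs):
        for t in range(m):
            s = sum(alpha[i] * y[i] * conjunction_kernel(X[i], X[t])
                    for i in range(m))
            if y[t] * s <= 0:        # mistake: add this example to the expansion
                alpha[t] += 1
    return alpha

def predict(alpha, X, y, x_new):
    s = sum(alpha[i] * y[i] * conjunction_kernel(X[i], x_new)
            for i in range(len(y)))
    return 1 if s > 0 else -1
```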