Leonard J. Schulman

We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd's heuristic that ...
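For orientation, the textbook form of Lloyd's heuristic that the abstract above refers to can be sketched as follows. This is only the classic alternating assign-and-average iteration, not the variants or the clusterability criterion proposed in the paper; the function names `lloyd` and `squared_distance`, the caller-supplied initial centers, and the `max_iters` cutoff are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(
    points: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
    max_iters: int = 100,  # hypothetical safety cutoff, not from the source
) -> Tuple[List[Point], List[int]]:
    """Classic Lloyd iteration starting from caller-supplied initial centers.

    Returns the final centers and, for each point, the index of the center
    it was assigned to in the last assignment step.
    """
    current: List[Point] = [tuple(c) for c in centers]
    assignment: List[int] = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest current center.
        assignment = [
            min(range(len(current)), key=lambda j: squared_distance(p, current[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        updated: List[Point] = []
        for j, old in enumerate(current):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                updated.append(
                    tuple(
                        sum(m[d] for m in members) / len(members)
                        for d in range(len(old))
                    )
                )
            else:
                # An empty cluster keeps its previous center (one common convention).
                updated.append(old)
        if updated == current:
            break  # centers stopped moving: a fixed point of the iteration
        current = updated
    return current, assignment


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(lloyd(data, [(0, 0), (10, 10)]))
```

The result depends heavily on the initial centers, which is why this sketch takes them as an explicit argument instead of choosing them itself.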
We present a fairly general method for finding deterministic constructions obeying what we call k-restrictions; this yields structures of size not much larger than the probabilistic bound. The structures constructed by our method include (n, k)-universal sets (a collection of binary vectors of length n such that for any subset of size k of the indices, all $2^k$ ...
We prove that if a linear error-correcting code $C: \{0,1\}^n \rightarrow \{0,1\}^m$ is such that a bit of the message can be probabilistically reconstructed by looking at two entries of a corrupted codeword, then $m = 2^{\Omega(n)}$. We also present several extensions of this result. We show a reduction from the complexity of one-round, information-theoretic Private Information ...
Let the input to a computation problem be split between two processors connected by a communication link, and let an interactive protocol be known by which, on any input, the processors can solve the problem using no more than T transmissions of bits between them, provided the channel is noiseless in each direction. We study the following question: if in fact ...
We show that, given data from a mixture of k well-separated spherical Gaussians in $\mathbb{R}^d$, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to near-optimal precision, if the dimension is high ($d \gg \ln k$). We relate this to previous theoretical and empirical work on the EM algorithm.
Sampling is an important primitive in probabilistic and quantum algorithms. In the spirit of communication complexity, given a function $f: X \times Y \rightarrow \{0,1\}$ and a probability distribution $D$ over $X \times Y$, we define the sampling complexity of $(f,D)$ as the minimum number of bits Alice and Bob must communicate for Alice to pick $x \in X$ ...
Many quantum algorithms, including Shor's celebrated factoring and discrete log algorithms, proceed by reduction to a hidden subgroup problem, in which an unknown subgroup H of a group G must be determined from a quantum state ψ over G that is uniformly supported on a left coset of H. These hidden subgroup problems are ...