Algorithms column: sublinear time algorithms

@article{Kumar2003AlgorithmsCS,
  title={Algorithms column: sublinear time algorithms},
  author={Ravi Kumar and Ronitt Rubinfeld},
  journal={SIGACT News},
  year={2003},
  volume={34},
  pages={57--67}
}
With the recent tremendous increase in computational power and cheap storage, we are blessed with a multitude of available, and possibly useful, information. It is always nice to have something for (almost) nothing. However, this blessing is also something of a curse, for we may also be asked to do something meaningful with all of this data. The scale of these data sets, coupled with the typical situation in which there is very little time to perform our computations, raises the question of… 

Something for (Almost) Nothing: New Advances in Sublinear-Time Algorithms

TLDR
This chapter focuses on a formalization of approximate solutions that has been widely studied in an area of theoretical computer science known as property testing, and on the close connection between property testing and general parameter estimation.

Separating sublinear time computations by approximate diameter

TLDR
It is shown that, for any parameter r∈(0,1), bounded-error randomized sublinear time computation in time O(n^r) cannot be simulated by any zero-error randomized sublinear time algorithm in o(n) time or queries; the same is true for zero-error randomized computation versus deterministic computation.

Algebraic Property Testing

TLDR
It is shown that sparsity and invariance under the affine group of permutations are sufficient conditions for a notion of very structured testing, and a new characterization of the extensively studied BCH codes is revealed.

Sampling subproblems of heterogeneous Max-Cut problems and approximation algorithms

TLDR
An algorithm is developed and analyzed which uses a novel sampling method to obtain improved bounds for approximating the Max-Cut of a graph and it is shown that by judicious choice of sampling probabilities one can obtain error bounds that are superior to the ones obtained by uniform sampling.

Separating Sublinear Time Computations by Approximate Diameter

TLDR
A class of separations among sublinear time computations is obtained using various versions of the approximate diameter problem, based on restrictions on the format of the input data.

Removing the Haystack to Find the Needle(s): Minesweeper, an adaptive join algorithm

TLDR
A new algorithm is described, Minesweeper, that is able to satisfy stronger runtime guarantees than previous join algorithms (colloquially, ‘beyond worst-case guarantees’) for data in indexed search trees and a dichotomy theorem is developed for the certificate-based notion of complexity.

Sublinear-time approximation algorithms for clustering via random sampling

TLDR
Using a novel analysis of a random sampling approach for four clustering problems in metric spaces, this work obtains, for the first time, approximation algorithms whose running time is independent of the input size and depends only on k and the diameter of the metric space.

Beyond worst-case analysis for joins with minesweeper

TLDR
A new algorithm is described, Minesweeper, that is able to satisfy stronger runtime guarantees than previous join algorithms (colloquially, "beyond worst-case" guarantees) for data in indexed search trees and a dichotomy theorem is developed for the certificate-based notion of complexity.

On-line approximate string matching with bounded errors

References

Quick Approximation to Matrices and Applications

TLDR
The matrix approximation is generalized to multi-dimensional arrays, from which approximation algorithms for all dense Max-SNP problems are derived; the Regularity Lemma is also derived.

Random sampling and approximation of MAX-CSP problems

TLDR
A new efficient sampling method is presented for approximating r-dimensional Maximum Constraint Satisfaction Problems, MAX-rCSP, on n variables up to an additive error εn^r, which gives for the first time a bound polynomial in ε^{-1} on the sample size necessary to carry out the above approximation.

Better streaming algorithms for clustering problems

TLDR
A randomized algorithm for the k-Median problem is given which produces a constant factor approximation in one pass using storage space O(k poly log n), and which gives bicriterion guarantees, producing constant factor approximations by increasing the allowed fraction of outliers slightly.

Monotonicity testing over general poset domains

TLDR
It is shown that in its most general setting, testing that Boolean functions are close to monotone is equivalent, with respect to the number of required queries, to several other testing problems in logic and graph theory.

On the strength of comparisons in property testing

On convergence of stochastic processes

It is clear that for given {μ_n} and μ, the better theorem of this kind would be the one in which (2) is proved for the larger class of functions f. In this paper we shall show that certain known

Improved Testing Algorithms for Monotonicity

TLDR
Improved algorithms for testing monotonicity of functions are presented, given the ability to query an unknown function f: Σ^n → Ξ. The test always accepts a monotone f, and rejects f with high probability if it is ε-far from being monotone.
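
The accept/reject contract described in this entry can be illustrated with a minimal sketch of a random edge test on the Boolean hypercube. This is not the algorithm of the paper listed above: it assumes the simplest case Σ = {0,1}, and the function name `edge_monotonicity_test`, its parameters, and the sample count of 2n/ε are hypothetical choices made for illustration (that count follows the usual analysis for Boolean-valued f, where an ε-far function violates at least an ε/n fraction of hypercube edges).

```python
import math
import random


def edge_monotonicity_test(f, n, eps, rng=None):
    """Illustrative edge test for monotonicity of f on {0,1}^n.

    Hypothetical helper, not taken from the paper. It samples random
    hypercube edges (x with bit i = 0, x with bit i = 1) and rejects
    as soon as one edge is decreasing. A monotone f has no decreasing
    edge, so it is always accepted (one-sided error).
    """
    rng = rng or random.Random()
    # Assumed sample count: about 2n/eps random edges.
    trials = math.ceil(2 * n / eps)
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        i = rng.randrange(n)
        lo = tuple(x[:i] + [0] + x[i + 1:])  # bit i cleared
        hi = tuple(x[:i] + [1] + x[i + 1:])  # bit i set
        if f(lo) > f(hi):
            return False  # found a violated edge: reject
    return True  # no violation seen: accept


if __name__ == "__main__":
    print(edge_monotonicity_test(lambda x: max(x), n=8, eps=0.1))   # OR is monotone
    print(edge_monotonicity_test(lambda x: -sum(x), n=8, eps=0.1))  # decreasing on every edge
```

The sketch makes about 2n/ε pairs of queries to f, independent of the 2^n points in the domain, which is the sense in which such a test is sublinear.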

Polynomial time approximation schemes for dense instances of NP-hard problems

We present a unified framework for designing polynomial time approximation schemes (PTASs) for “dense” instances of many NP-hard optimization problems, including maximum cut, graph bisection, graph

Combinatorial property testing (a survey)

  • Oded Goldreich
  • Mathematics
    Randomization Methods in Algorithm Design
  • 1997
TLDR
This work considers the question of determining whether a given object has a predetermined property or is "far" from any object having the property, and focuses on combinatorial properties, specifically graph properties.

Property Testing in Bounded Degree Graphs

TLDR
This work develops the study of testing graph properties as initiated by Goldreich, Goldwasser and Ron and presents randomized algorithms for testing whether an unknown bounded-degree graph is connected, k-connected (for k>1), cycle-free, and Eulerian.