Smooth Histograms for Sliding Windows

@article{Braverman2007SmoothHF,
  title={Smooth Histograms for Sliding Windows},
  author={Vladimir Braverman and Rafail Ostrovsky},
  journal={48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07)},
  year={2007},
  pages={283-293}
}
  • V. Braverman, R. Ostrovsky
  • Published 21 October 2007
  • Computer Science
  • 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07)
In the streaming model, elements arrive sequentially and can be observed only once. Maintaining statistics and aggregates is an important and non-trivial task in this model. It becomes even more challenging in the sliding windows model, where statistics must be maintained only over the most recent n elements. In their pioneering paper, Datar, Gionis, Indyk and Motwani [15] presented exponential histograms, an effective method for estimating statistics on sliding windows. In this paper we…
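To make the exponential-histogram idea mentioned in the abstract concrete, here is a minimal Python sketch of the classic basic-counting structure (approximately counting the 1s among the last `window` bits). It illustrates the earlier technique of Datar et al. in simplified form and is not the smooth-histogram construction of this paper. The class name `ExponentialHistogram`, its parameters, and the choice of `ceil(1/eps) + 1` buckets per size are assumptions made for illustration.

```python
import math


class ExponentialHistogram:
    """Approximate count of 1s among the last `window` bits (illustrative sketch)."""

    def __init__(self, window, eps):
        self.window = window
        # Assumed parameter choice: allow at most `cap` buckets of each size.
        self.cap = math.ceil(1 / eps) + 1
        # Buckets are (timestamp of newest 1 in bucket, size), newest first.
        self.buckets = []
        self.time = 0

    def add(self, bit):
        self.time += 1
        # Drop the oldest bucket once its newest 1 has left the window.
        while self.buckets and self.buckets[-1][0] <= self.time - self.window:
            self.buckets.pop()
        if not bit:
            return
        self.buckets.insert(0, (self.time, 1))
        self._merge()

    def _merge(self):
        # Whenever a size has more than `cap` buckets, merge its two oldest
        # into one bucket of twice the size, then check the next size up.
        size = 1
        while True:
            idx = [i for i, (_, s) in enumerate(self.buckets) if s == size]
            if len(idx) <= self.cap:
                break
            older, newer = idx[-1], idx[-2]
            self.buckets[newer] = (self.buckets[newer][0], 2 * size)
            del self.buckets[older]
            size *= 2

    def estimate(self):
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        # Only the oldest bucket can straddle the window boundary,
        # so count half of it.
        return total - self.buckets[-1][1] // 2
```

Feeding bits one at a time with `add(bit)` and calling `estimate()` gives a count whose relative error should stay within `eps` under these assumptions, while storing only a logarithmic number of buckets per size class rather than the whole window.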
Almost-Smooth Histograms and Sliding-Window Graph Algorithms
TLDR
The smooth-histogram framework of Braverman and Ostrovsky (FOCS 2007) is extended to almost-smooth functions, which include all subadditive functions, and it is shown that if a subadditive function can be $(1+\epsilon)$-approximated in the insertion-only streaming model, then it can be…
A Unified Approach for Clustering Problems on Sliding Windows
TLDR
A data structure that extends smooth histograms as introduced by Braverman and Ostrovsky to operate on a broader class of functions is introduced, and it is shown that using only polylogarithmic space the authors can maintain a summary of the current window from which they can construct an O(1)-approximate clustering solution.
Numerical Linear Algebra in the Sliding Window Model
TLDR
This work gives a deterministic algorithm that achieves spectral approximation in the sliding window model that can be viewed as a generalization of smooth histograms, using the Loewner ordering of PSD matrices, and gives algorithms for both spectral approximation and low-rank approximation that are space-optimal up to polylogarithmic factors.
Improved Sliding Window Algorithms for Clustering and Coverage via Bucketing-Based Sketches
TLDR
This work proposes a new algorithmic framework for designing efficient sliding window algorithms via bucketing-based sketches and develops space-efficient sliding window algorithms for k-cover, k-clustering and diversity maximization problems.
Symmetric Norm Estimation and Regression on Sliding Windows
TLDR
This work observes that the symmetric norm streaming algorithm of Braverman et al. (STOC 2017) can be reduced to identifying and approximating the frequency of heavy-hitters in a number of substreams, and introduces a heavy-hitter algorithm that gives a (1 + ε)-approximation to each of the reported frequencies in the sliding window model.
k-Center Clustering with Outliers in Sliding Windows
TLDR
This work provides efficient algorithms for metric k-center clustering in the streaming model under the sliding window setting and shows, as a by-product, how to estimate the effective diameter of the window W, which is a measure of the spread of the window points, disregarding a given fraction of noisy distances.
Sliding Window Algorithms for k-Clustering Problems
TLDR
This work provides simple and practical algorithms that update the solution efficiently with each arrival rather than recomputing it from scratch, and finds solutions with costs only slightly higher than those returned by algorithms with access to the full dataset.
Nearly Optimal Distinct Elements and Heavy Hitters on Sliding Windows
TLDR
The composable histogram along with a careful combination of existing techniques to track either the identity or frequency of a few specific items suffices to obtain algorithms for both distinct elements and $\ell_p$-heavy hitters that are nearly optimal in both $n$ and $\epsilon$.
Dynamic Graphs in the Sliding-Window Model
TLDR
An extensive set of positive results including algorithms for constructing basic graph synopses like combinatorial sparsifiers and spanners as well as approximating classic graph properties such as the size of a graph matching or minimum spanning tree are presented.
Submodular Maximization over Sliding Windows
TLDR
The first algorithms in the sliding window model for maximizing a monotone/non-monotone submodular function under cardinality and matroid constraints are obtained.

References

SHOWING 1-10 OF 32 REFERENCES
Maintaining Stream Statistics over Sliding Windows
TLDR
The problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far, is considered, and it is shown that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, the number of 1's can be estimated to within a factor of $1 + \epsilon$.
Maintaining significant stream statistics over sliding windows
TLDR
It is proved that any data structure for the Significant One Counting problem must use at least $\Omega(\frac{1}{\epsilon} \log^2 \frac{1}{\theta} + \log \epsilon\theta n)$ bits of memory.
Approximate counts and quantiles over sliding windows
TLDR
This work considers the problem of maintaining ε-approximate counts and quantiles over a stream sliding window using limited space and presents various deterministic and randomized algorithms for approximate counts and quantiles that require O(1/ε polylog(1/ε, N)) space.
Maintaining variance and k-medians over data stream windows
TLDR
A novel technique is presented for solving two important and related problems in the sliding window model, maintaining variance and maintaining a k-median clustering, and a constant-factor approximation algorithm is presented.
Geometric Optimization Problems over Sliding Windows
TLDR
A simple algorithm that only needs to store $O(\frac{1}{\epsilon} \log R)$ points at any time is given, which is optimal and improves Feigenbaum, Kannan, and Zhang's recent solution by two logarithmic factors.
Sampling from a moving window over streaming data
TLDR
This work introduces the problem of sampling from a moving window of recent items from a data stream and develops two algorithms, the first of which, "chain-sample", extends reservoir sampling to deal with the expiration of data elements from the sample and the second, "priority-sample", works even when the number of elements in the window can vary dynamically over time.
A simpler and more efficient deterministic scheme for finding frequent items over sliding windows
TLDR
This paper gives a simple scheme for identifying ε-approximate frequent items over a sliding window of size n, and extends the scheme to variable-size windows.
Succinct Sampling on Streams
TLDR
This paper shows that Succinct Sampling on Streams algorithms are possible for all variants of the problem mentioned above, i.e., both with and without replacement and both for one-at-a-time and bursty arrival models.
Estimating Rarity and Similarity over Data Stream Windows
In the windowed data stream model, we observe items coming in over time. At any time $t$, we consider the window of the last $N$ observations $a_{t-(N-1)}, a_{t-(N-2)}, \ldots, a_t$, each $a_i \in \{1, \ldots, u\}$;
Estimating Frequency Moments of Data Streams Using Random Linear Combinations
TLDR
This paper presents an algorithm for estimating \(F_k\) for k > 2 over general update streams, whose space complexity is \(\tilde{O}(n^{1-\frac{1}{k-1}})\) and whose time complexity for processing each stream update is \(\tilde{O}(1)\).