On the intrinsic locality properties of Web reference streams

@article{Fonseca2003OnTI,
  title={On the intrinsic locality properties of Web reference streams},
  author={Rodrigo Fonseca and Virg{\'i}lio A. F. Almeida and Mark Crovella and Bruno D. Abrahao},
  journal={IEEE INFOCOM 2003. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE Cat. No.03CH37428)},
  year={2003},
  volume={1},
  pages={448--458}
}
There has been considerable work done in the study of Web reference streams: sequences of requests for Web objects. In particular, many studies have looked at the locality properties of such streams, because of the impact of locality on the design and performance of caching and prefetching systems. However, a general framework for understanding why reference streams exhibit given locality properties has not yet emerged. In this paper we take a first step in this direction. We propose a… 
Comparing Strength of Locality of Reference: Popularity, Temporal Correlations, and Some Folk Theorems for the Miss Rates and Outputs of Caches
TLDR
This work focuses on two "folk theorems": the stronger the locality of reference, the smaller the miss rate of the cache; and good caching is expected to produce an output stream of requests exhibiting less locality of reference than the input stream of requests.
Locality Characteristics of Web Streams Revisited
TLDR
The simulation results show which caching policies are adept at exploiting locality characteristics of streams that result from aggregation of filtered streams and illustrate the locality properties of the resulting filtered streams.
Locality of Reference in an Hierarchy of Web Caches
TLDR
This work presents an extensive evaluation of request filtering in a hierarchy of proxy caches, using the recently proposed ADF model as well as entropy as a metric for Web traffic characterization, and proposes the use of average entropy for comparing the locality of reference of different streams.
Temporal locality in today's content caching: why it matters and how to model it
TLDR
This paper proposes a new parsimonious traffic model, named the Shot Noise Model (SNM), that enables users to natively capture the dynamics of content popularity, whilst still being sufficiently simple to be employed effectively for both analytical and scalable simulative studies of caching systems.
PERFORMANCE EVALUATION OF PAGE REMOVAL POLICIES
TLDR
The effectiveness of the LFU-K replacement policy for caching on proxy servers is analyzed, and the results of analyzing traces taken from real proxy servers are given to reveal a set of properties of network traffic.
A General, Tractable and Accurate Model for a Cascade of Caches
TLDR
This work presents a simple but accurate approximate analysis for caches fed by general "renewal" traffic patterns. The analysis handles traffic patterns beyond the traditional independent reference model, thus permitting simple assessment of a cascade of caches as well as improved understanding of the phenomena involved in cache hierarchies.
Exploiting Stream Request Locality to Improve Query Throughput of a Data Integration System
This paper focuses on the problem of improving throughput of distributed query processing in an RDBMS-based data integration system. Although a buffer pool can be used in an RDBMS to cache disk pages
Time-Domain Analysis of Web Cache Filter Effects (Extended Version)
TLDR
The simulation results show that a Web cache reduces both the peak and the mean request arrival rate for Web traffic workloads, while the variance-to-mean ratio of the filtered traffic typically increases, depending on the input arrival process and the configuration of the cache.

References

Characterizing reference locality in the WWW
TLDR
The authors propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers and show that temporal locality can be characterized by the marginal distribution of the stack distance trace; they also propose models for typical distributions and compare their cache performance to the traces.
Characterizing temporal locality and its impact on web server performance
  • L. Cherkasova, G. Ciardo
  • Computer Science
    Proceedings Ninth International Conference on Computer Communications and Networks (Cat.No.00EX440)
  • 2000
TLDR
A new measure of temporal locality, the scaled stack distance, is introduced, which is insensitive to popularity and captures instead the impact of short-term correlation, and is used to parameterize a synthetic trace generator.
Sources and characteristics of Web temporal locality
  • Shudong Jin, Azer Bestavros
  • Computer Science
    Proceedings 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (Cat. No.PR00728)
  • 2000
TLDR
This work proposes a new and robust metric that enables accurate characterization of the locality of reference in a number of representative proxy cache traces and shows that there are measurable differences between the degrees (and sources) of temporal locality across these traces.
Temporal locality and its impact on Web proxy cache performance
Changes in Web client access patterns: Characteristics and caching implications
TLDR
This study compares two measurements of Web client workloads separated in time by three years, both captured from the same computing facility at Boston University and finds that for the computing facility represented by traces between 1995 and 1998, the benefits of using size‐based caching policies have diminished and the potential for caching requested files in the network has declined.
Web caching and Zipf-like distributions: evidence and implications
  • L. Breslau, P. Cao, Li Fan, Graham Phillips, S. Shenker
  • Computer Science
    IEEE INFOCOM '99. Conference on Computer Communications. Proceedings. Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies. The Future is Now (Cat. No.99CH36320)
  • 1999
TLDR
This paper investigates the page request distribution seen by Web proxy caches using traces from a variety of sources and considers a simple model where the Web accesses are independent and the reference probability of the documents follows a Zipf-like distribution, suggesting that the various observed properties of hit-ratios and temporal locality are indeed inherent to Web accesses observed by proxies.
Characteristics of WWW Client-based Traces
TLDR
This paper presents a descriptive statistical summary of the traces of actual executions of NCSA Mosaic, and shows that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, and the distribution of user requests for documents.
On filter effects in web caching hierarchies
TLDR
The simulation results demonstrate that size-based partitioning and heterogeneous cache replacement policies each offer improvements in overall caching performance, and considers novel cache management techniques that can better exploit the changing workload characteristics across a multilevel Web proxy caching hierarchy.
The Trickle-Down Effect: Web Caching and Server Request Distribution