Changes in Web client access patterns: Characteristics and caching implications

Abstract

Understanding the nature of the workloads and system demands created by users of the World Wide Web is crucial to properly designing and provisioning Web services. Previous measurements of Web client workloads have been shown to exhibit a number of characteristic features; however, it is not clear how those features may be changing with time. In this study we compare two measurements of Web client workloads separated in time by three years, both captured from the same computing facility at Boston University. The older dataset, obtained in 1995, is well known in the research literature and has been the basis for a wide variety of studies. The newer dataset was captured in 1998 and is comparable in size to the older dataset. The new dataset has the drawback that the collection of users measured may no longer be representative of general Web users; however, using it has the advantage that many comparisons can be drawn more clearly than would be possible using a new, different source of measurement. Our results fall into two categories. First we compare the statistical and distributional properties of Web requests across the two datasets. This serves to reinforce and deepen our understanding of the characteristic statistical properties of Web client requests. We find that the kinds of distributions that best describe document sizes have not changed between 1995 and 1998, although specific values of the distributional parameters are different. Second, we explore the question of how the observed differences in the properties of Web client requests, particularly the popularity and temporal locality properties, affect the potential for Web file caching in the network. We find that for the computing facility represented by our traces between 1995 and 1998, (1) the benefits of using size‐based caching policies have diminished; and (2) the potential for caching requested files in the network has declined.

DOI: 10.1023/A:1019236319752
