Characteristics of WWW Client-based Traces
The explosion of WWW traffic necessitates an accurate picture of WWW use, and in particular requires a good understanding of client requests for WWW documents. To address this need, we have collected traces of actual executions of NCSA Mosaic, reflecting over half a million user requests for WWW documents. In this paper we describe the methods we used to collect our traces, and the formats of the collected data. Next, we present a descriptive statistical summary of the traces we collected, which identifies a number of trends and reference patterns in WWW use. In particular, we show that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, the distribution of user requests for documents, and the number of references to documents as a function of their overall rank in popularity (Zipf's law). Finally, we show how the power-law distributions derived from our traces can be used to guide system designers interested in caching WWW documents.
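As a rough illustration of what a Zipf-like relationship between reference count and popularity rank looks like in practice, the sketch below estimates the slope of log(count) against log(rank) by ordinary least squares. It is only a minimal sketch under stated assumptions: the function name `zipf_slope` and the synthetic trace are hypothetical, the data is not taken from the traces described here, and a log-log regression is just one simple way to check for a power law, not necessarily the procedure used in this paper.

```python
import math
from collections import Counter


def zipf_slope(requests):
    """Least-squares slope of log(reference count) versus log(popularity rank).

    `requests` is any iterable of document identifiers (hypothetical input
    format). A slope near -1 is what Zipf's law would predict.
    """
    # Count references per document and order them from most to least popular.
    counts = sorted(Counter(requests).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic trace (an assumption for illustration, not measured data):
# the document of rank i is requested roughly 1000 / i times.
trace = []
for rank in range(1, 51):
    trace.extend([f"doc{rank}"] * round(1000 / rank))

slope = zipf_slope(trace)
print(f"estimated log-log slope: {slope:.3f}")
```

On this synthetic input the estimated slope comes out close to -1, because the trace was constructed to follow a 1/rank law. Applied to a real request log, the same function would need at least two distinct documents, and the fit should be inspected visually on a log-log plot before drawing conclusions.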