Do TREC web collections look like the web?

Ian Soboroff. SIGIR Forum.
We measure the WT10g test collection, used in the TREC-9 and TREC 2001 Web Tracks, and the .GOV test collection used in the TREC 2002 Web and Interactive Tracks, with common measures used in the web topology community, in order to see if these collections "look like" the web. This is not an idle question; characteristics of the web, such as power law relationships, diameter, and connected components have all been observed within the scope of general web crawls, constructed by blindly following…
This paper has 32 citations.
