Markus Kinzler

As the World Wide Web grows rapidly, it is becoming increasingly challenging to gather representative information about it. Instead of crawling the web exhaustively, one has to resort to other techniques, such as sampling, to determine the properties of the web. A uniform random sample of the web would be useful to determine the percentage of web pages in a …
This diploma thesis investigated the problem of sampling URLs uniformly at random from the web. Such a method for sampling URLs can be used to estimate various properties of web pages. For example, one could estimate:
• The fraction of web pages written in various languages
• The coverage of various search engines
• The distribution of web pages in …
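To illustrate how a uniform sample supports such estimates, the following is a minimal sketch, not the method used in the thesis. It assumes a uniform random sample of pages is already available and that each page can be tested against some property (here, its language). The function name `estimate_fraction`, the `(url, language)` record layout, and the example data are hypothetical; the interval uses the standard normal approximation for a proportion, which is only reasonable for fairly large samples.

```python
import math


def estimate_fraction(sample, predicate, z=1.96):
    """Estimate the fraction of sampled items satisfying `predicate`.

    Returns (estimate, lower, upper), where the bounds form a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    Assumes `sample` was drawn uniformly at random from the population.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    hits = sum(1 for item in sample if predicate(item))
    p = hits / n
    # Standard error of a sample proportion under uniform sampling.
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp the interval to the valid range [0, 1].
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical sample of (url, language) records, for illustration only.
pages = [
    ("http://example.org/a", "en"),
    ("http://example.org/b", "de"),
    ("http://example.org/c", "en"),
    ("http://example.org/d", "en"),
]
estimate, lower, upper = estimate_fraction(pages, lambda page: page[1] == "en")
print(f"estimated fraction of English pages: {estimate:.2f} [{lower:.2f}, {upper:.2f}]")
```

With only four records the interval is very wide, which is the point of the sketch: the quality of the estimate depends on both the sample size and on the sample really being uniform.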
