Distributed web crawling

Known as: Distributed crawling, Distributed search, Distributed web crawler 
Distributed web crawling is a distributed computing technique whereby Internet search engines employ many computers to index the Internet via web…
Wikipedia

Papers overview

2010
The basic requirements of the distributed Web crawling systems are: short download time, low communication overhead and balanced…
2009
In this report we will outline the relevant background research, the design, the implementation and the evaluation of a…
2008
We identify the issues that are important in the design of a geographically distributed Web crawler. The identified issues are…
2007
This paper presents a multi-objective approach to Web space partitioning, aimed to improve distributed crawling efficiency. The…
2005
This paper evaluates scalable distributed crawling by means of the geographical partition of the Web. The approach is based on…
2004
In this paper, we present the design and implementation of a distributed web crawler. We begin by motivating the need for such a…
2004
Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the…
Highly Cited
2003
We report our experience in implementing UbiCrawler, a scalable distributed Web crawler, using the Java programming language. The…
Highly Cited
2002
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages…
2001
A web crawling system using a distributed architecture needs to coordinate the whole system when the nodes in the system change…