Semantic Scholar
Web crawler
Known as: Webcrawler, Crawl site, RBSE
A Web crawler is an Internet bot which systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering). Web…
Wikipedia
Related topics
50 relations
Ajax (programming)
Apache Hadoop
Apache Nutch
Apache Storm
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Review
Web Crawler: A Review
Md. Abu Kausar, V. Dhaka, S. K. Singh, Maurice de Kunder
2013
Corpus ID: 18720745
Information Retrieval deals with searching and retrieving information within the documents and it also searches the online…
Highly Cited
A focused crawler for Dark Web forums
Tianjun Fu, A. Abbasi, Hsinchun Chen
J. Assoc. Inf. Sci. Technol.
2010
Corpus ID: 13097269
The unprecedented growth of the Internet has given rise to the Dark Web, the problematic facet of the Web associated with…
Highly Cited
Web Crawler Architecture
Marc Najork
Encyclopedia of Database Systems
2009
Corpus ID: 29031380
Definition: A web crawler is a program that, given one or more seed URLs, downloads the web pages associated with these URLs…
Highly Cited
Investigating web services on the world wide web
Eyhab Al-Masri, Q. Mahmoud
The Web Conference
2008
Corpus ID: 12570844
Searching for Web service access points is no longer attached to service registries as Web search engines have become a new major…
Highly Cited
A Measurement Study of a Large-Scale P2P IPTV System
Xiaojun Hei, Chao Liang, Jian Liang, Yong Liu, K. Ross
IEEE transactions on multimedia
2007
Corpus ID: 6166089
An emerging Internet application, IPTV, has the potential to flood Internet access and backbone ISPs with massive amounts of new…
Highly Cited
Design and implementation of a high-performance distributed Web crawler
Vladislav Shkapenyuk, Torsten Suel
Proceedings / International Conference on Data…
2002
Corpus ID: 10651529
Broad Web search engines as well as many more specialized search tools rely on Web crawlers to acquire large collections of pages…
Highly Cited
A web crawler design for data mining
M. Thelwall
Journal of information science
2001
Corpus ID: 17890086
The content of the web has increasingly become a focus for academic research. Computer programs are needed in order to conduct…
Highly Cited
Peer-to-peer architecture case study: Gnutella network
M. Ripeanu
Proceedings First International Conference on…
2001
Corpus ID: 444337
Despite recent excitement generated by the P2P paradigm and despite surprisingly fast deployment of some P2P applications, there…
Review
SPSS 13.0 Guide to Data Analysis
M. Norusis
2000
Corpus ID: 62743249
The SPSS 16.0 Guide to Data Analysis is a friendly introduction to both data analysis and SPSS, the world's leading desktop…
Highly Cited
Mercator: A scalable, extensible Web crawler
Allan Heydon, Marc Najork
World wide web (Bussum)
1999
Corpus ID: 207736356
This paper describes Mercator, a scalable, extensible Web crawler written entirely in Java. Scalable Web crawlers are an…