Semantic Scholar
Web crawler
Known as: Webcrawler, Crawl site, RBSE
A Web crawler is an Internet bot which systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering). Web…
Wikipedia
Related topics
50 relations
Ajax (programming)
Apache Hadoop
Apache Nutch
Apache Storm
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
A Study of Recommending Locations on Location-Based Social Network by Collaborative Filtering
Dequan Zhou, Bin Wang, M. Rahimi, Xin Wang
Canadian Conference on AI
2012
Corpus ID: 5334732
The development of location-based social networking (LBSN) services is growing rapidly these days. Users of LBSN services are…
Semantic deep web: automatic attribute extraction from the deep web data sources
Yoo Jung An, J. Geller, Yi-Ta Wu, Soon Ae Chun
ACM Symposium on Applied Computing
2007
Corpus ID: 16461380
"Deep Web" refers to the rich information and data hidden in backend databases, etc., that search engines or Web crawlers cannot…
Highly Cited
Malware prevalence in the KaZaA file-sharing network
Seungwon Shin, Jaeyeon Jung, H. Balakrishnan
ACM/SIGCOMM Internet Measurement Conference
2006
Corpus ID: 2976102
In recent years, more than 200 viruses have been reported to use a peer-to-peer (P2P) file-sharing network as a propagation…
Highly Cited
DR-NEGOTIATE - a system for automated agent negotiation with defeasible logic-based strategies
Thomas Skylogiannis, G. Antoniou, Nick Bassiliades, Guido Governatori, Antonis Bikakis
International Conference on E-Learning, E…
2005
Corpus ID: 1077980
Highly Cited
Constructing Interface Schemas for Search Interfaces of Web Databases
Hai He, W. Meng, Clement T. Yu, Zonghuan Wu
WISE
2005
Corpus ID: 10943355
Many databases have become Web-accessible through form-based search interfaces (i.e., search forms) that allow users to specify…
Highly Cited
Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First Six Months of Data from Project Honey Pot
Matthew B. Prince, Benjamin M. Dahl, L. Holloway, A. M. Keller, Eric Langheinrich
International Conference on Email and Anti-Spam
2005
Corpus ID: 41252269
This paper summarizes and analyses data compiled on the activities of email harvesters gathered through a 5,000+ member honey pot…
Highly Cited
What's there and what's not?: focused crawling for missing documents in digital libraries
Ziming Zhuang, R. Wagle, C. Lee Giles
ACM/IEEE Joint Conference on Digital Libraries
2005
Corpus ID: 13507220
Some large scale topical digital libraries, such as CiteSeer, harvest online academic documents by crawling open-access archives…
Highly Cited
A functional relationship between species richness of spiders and lichens in spruce
B. Gunnarsson, M. Hake, S. Hultengren
Biodiversity and Conservation
2004
Corpus ID: 25270692
Modern forestry has created stands with even age distribution of trees and fragmentation of the habitat. In boreal forests, the…
Highly Cited
Development of "Souryu I & II" -Connected Crawler Vehicle for Inspection of Narrow and Winding Space
T. Takayama, S. Hirose
J. Robotics Mechatronics
2003
Corpus ID: 30830639
Highly Cited
Games spiders play
Susan E. Richert
Behavioral Ecology and Sociobiology
1979
Corpus ID: 24952392
Summary: 1. The predictions of game theory concerning the use of resource assessment strategies by animals in conflict situations…