Web crawler

Known as: Webcrawler, Crawl site, RBSE

A Web crawler is an Internet bot which systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering). Web…

Papers overview

Highly Cited

2012

A Study of Recommending Locations on Location-Based Social Network by Collaborative Filtering

Dequan Zhou, Bin Wang, M. Rahimi, Xin Wang
Canadian Conference on AI
2012
Corpus ID: 5334732

The development of location-based social networking (LBSN) services is growing rapidly these days. Users of LBSN services are…

2007

Semantic deep web: automatic attribute extraction from the deep web data sources

Yoo Jung An, J. Geller, Yi-Ta Wu, Soon Ae Chun
ACM Symposium on Applied Computing
2007
Corpus ID: 16461380

"Deep Web" refers to the rich information and data hidden in backend databases, etc., that search engines or Web crawlers cannot…

Highly Cited

2006

Malware prevalence in the KaZaA file-sharing network

Seungwon Shin, Jaeyeon Jung, H. Balakrishnan
ACM/SIGCOMM Internet Measurement Conference
2006
Corpus ID: 2976102

In recent years, more than 200 viruses have been reported to use a peer-to-peer (P2P) file-sharing network as a propagation…

Highly Cited

2005

DR-NEGOTIATE - a system for automated agent negotiation with defeasible logic-based strategies

Thomas Skylogiannis, G. Antoniou, Nick Bassiliades, Guido Governatori, Antonis Bikakis
International Conference on E-Learning, E…
2005
Corpus ID: 1077980

Highly Cited

2005

Constructing Interface Schemas for Search Interfaces of Web Databases

Many databases have become Web-accessible through form-based search interfaces (i.e., search forms) that allow users to specify…

Highly Cited

2005

What's there and what's not?: focused crawling for missing documents in digital libraries

Ziming Zhuang, R. Wagle, C. Lee Giles
ACM/IEEE Joint Conference on Digital Libraries
2005
Corpus ID: 13507220

Some large scale topical digital libraries, such as CiteSeer, harvest online academic documents by crawling open-access archives…

Highly Cited

2005

Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First Six Months of Data from Project Honey Pot

Matthew B. Prince, Benjamin M. Dahl, L. Holloway, A. M. Keller, Eric Langheinrich
International Conference on Email and Anti-Spam
2005
Corpus ID: 41252269

This paper summarizes and analyses data compiled on the activities of email harvesters gathered through a 5,000+ member honey pot…

Highly Cited

2004

A functional relationship between species richness of spiders and lichens in spruce

Modern forestry has created stands with even age distribution of trees and fragmentation of the habitat. In boreal forests, the…

1999

CoBWeb-a crawler for the Brazilian Web

A. D. Silva, Eveline Veloso, P. B. Golgher, B. Ribeiro-Neto, Alberto H. F. Laender, N. Ziviani
6th International Symposium on String Processing…
1999
Corpus ID: 6065538

One of the key components of current Web search engines is the document collector. The paper describes CoBWeb, an automatic…

Highly Cited

1979

Games spiders play

Summary: 1. The predictions of game theory concerning the use of resource assessment strategies by animals in conflict situations…
