
Robots exclusion standard

Known as: Robot Exclusion Protocol, Robots exclusion protocol, Robots exclusion file 
The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots.
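As an illustration of the kind of file the standard describes, a minimal robots.txt might look like the sketch below (the paths and bot name are invented; `Allow` is a widely supported extension rather than part of the original 1994 de-facto standard):

```
User-agent: *
Disallow: /private/
Allow: /private/public-page.html

User-agent: BadBot
Disallow: /
```

Each record pairs a `User-agent` pattern with the path prefixes that agent may not (or, with `Allow`, may) fetch.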
Source: Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2012
Search engines are an everyday tool for Internet surfing. They are also a critical factor that affects e-business performance… 
2009
This article studies the rise of copy-reliant technologies - technologies such as Internet search engines and plagiarism… 
Review
2008
A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots… 
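The crawler-access regulation described above can be checked programmatically; as a sketch, Python's standard-library `urllib.robotparser` evaluates a robots.txt against a crawler's user agent (the rules and URLs here are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from
# http://example.com/robots.txt via RobotFileParser.set_url() + read().
rules = """
User-agent: *
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# can_fetch() applies the parsed rules to a (user agent, URL) pair.
print(rp.can_fetch("MyCrawler", "http://example.com/index.html"))  # True
print(rp.can_fetch("MyCrawler", "http://example.com/private/x"))   # False
```

A polite crawler consults `can_fetch()` before every request rather than caching a single yes/no answer per site.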
2007
The North Carolina State Archives and State Library of North Carolina collaborated to develop the North Carolina State Government… 
Highly Cited
2006
Academic researchers access commercial web sites to collect research data. This research practice is likely to increase. Is this… 
2006
Robots Exclusion standard [4] is a de-facto standard that is used to inform the crawlers, spiders or web robots about the… 
2005
Recently, Kozen has proposed a framework based on Kleene algebra with tests for verifying that a program satisfies a security… 
Highly Cited
2004
Web robots are software programs that automatically traverse the hyperlink structure of the World Wide Web in order to locate and… 
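Traversing the hyperlink structure, as described above, starts with extracting links from fetched pages; a minimal sketch using Python's standard `html.parser` (the HTML snippet is invented for illustration):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags, the edges a web robot follows."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical page content; a crawler would obtain this over HTTP.
html = '<p><a href="/docs">Docs</a> and <a href="https://example.com">home</a></p>'
collector = LinkCollector()
collector.feed(html)
print(collector.links)  # ['/docs', 'https://example.com']
```

A full robot would resolve relative links against the page URL, queue unvisited ones, and filter the queue through the robots exclusion rules before fetching.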
1999
One of the key components of current Web search engines is the document collector. The paper describes CoBWeb, an automatic…