
Apache Nutch

Known as: Fetcher, Nutch 
Apache Nutch is a highly extensible and scalable open source web crawler software project.
Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Review
2013
Perform web crawling and apply data mining in your application Overview Learn to run your application on single as well as… 
2013
The huge amount of information available on the internet makes it difficult for the user to get the exact search results according to his…
2012
The article presents the experiments carried out as part of the participation in the Tweet Contextualization (TC) track of INEX… 
2011
This paper reports on the development of a plagiarism detection system as part of the plagiarism detection task in PAN…
2011
The massive adoption of social media has provided new ways for individuals to express their opinions online. The blogosphere, an… 
2010
The trend in hardware design is towards implementing a complete system, intended for various applications, on a single chip. In… 
2009
Mainly discusses the Chinese information processing problems that exist in Nutch, and modifies and implements the function of…
2008
The overload of information makes it too time consuming for common users to find what they really want in time. But it is easier… 
1999
Tundra ecosystems appear to recover slowly from disturbance, but little long-term data concerning plant diversity has been… 
Highly Cited
1989
A super-scalar processor is one that is capable of sustaining an instruction-execution rate of more than one instruction per…