Semantic Scholar
Apache Nutch
Known as: Fetcher, Nutch
Apache Nutch is a highly extensible and scalable open source web crawler software project.
Wikipedia
Related topics
15 relations
Apache Hadoop
Doug Cutting
Information extraction
Java
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Review
2013
Web Crawling and Data Mining with Apache Nutch
Z. Laliwala, A. Shaikh
Corpus ID: 63747764
Perform web crawling and apply data mining in your application Overview Learn to run your application on single as well as…
2013
Profile Based Search Engine
A. Mantri, Priyanka Nawale, Trupti Pardeshi, Rajeshwary Shisode, R. Pagare
Corpus ID: 56367213
Huge amount of information available on internet makes it difficult for the user to get the exact search results according to his…
2012
A Hybrid Tweet Contextualization System using IR and Summarization
P. Bhaskar, Somnath Banerjee, Sivaji Bandyopadhyay
Conference and Labs of the Evaluation Forum
Corpus ID: 772432
The article presents the experiments carried out as part of the participation in the Tweet Contextualization (TC) track of INEX…
2011
Rule Based Plagiarism Detection using Information Retrieval - Notebook for PAN at CLEF 2011
Aniruddha Ghosh, P. Bhaskar, Santanu Pal, Sivaji Bandyopadhyay
Conference and Labs of the Evaluation Forum
Corpus ID: 14528973
This paper reports about the development of a Plagiarism detection system as a part of the Plagiarism detection task in PAN…
2011
Mapping the Blogosphere--Towards a Universal and Scalable Blog-Crawler
Philipp Berger, Patrick Hennig, Justus Bross, C. Meinel
IEEE Third Int'l Conference on Privacy, Security…
Corpus ID: 6577620
The massive adoption of social media has provided new ways for individuals to express their opinions online. The blogosphere, an…
2010
FPGA-Based Design of Controller for Sound Fetching from Codec Using Altera DE2 Board
A. R. M. Khan, A. P. Thakare, S. M. Gulhane
Corpus ID: 18183331
The trend in hardware design is towards implementing a complete system, intended for various applications, on a single chip. In…
2009
Research on of Chinese Problem in Nutch
Chen Jian-feng
Corpus ID: 63517125
Mainly discusses all kinds of Chinese information processing problems existing in Nutch, modifies and realizes the function of…
2008
Real Time Recommendation Utilizing Experts' Experiences
Jingyu Sun, Xue-li Yu, Zhi-Li Wu, Xianhua Li
Fifth International Conference on Fuzzy Systems…
Corpus ID: 6859960
The overload of information makes it too time consuming for common users to find what they really want in time. But it is easier…
1999
RECOVERY OF PRODUCTIVITY AND SPECIES DIVERSITY IN TUSSOCK TUNDRA FOLLOWING DISTURBANCE
M. Vavrek, N. Fetcher, James B. McGraw, G. R. Shaver, F. Iii, Brian Bovard
Corpus ID: 56107194
Tundra ecosystems appear to recover slowly from disturbance, but little long-term data concerning plant diversity has been…
Highly Cited
1989
Super-scalar processor design
William M. Johnson
Corpus ID: 18921209
A super-scalar processor is one that is capable of sustaining an instruction-execution rate of more than one instruction per…