Skip to search formSkip to main contentSkip to account menu

Apache Hadoop

Known as: HDFS, Hadoop YARN, Hadoop Distributed Filesystem 
Apache Hadoop (pronunciation: /həˈduːp/) is an open-source software framework for distributed storage and distributed processing of very large data… 
Wikipedia (opens in a new tab)

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Review
2015
Review
2015
  • 2015
  • Corpus ID: 64185647
If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step… 
Highly Cited
2012
Highly Cited
2012
Software testing represents one of the most explored fields of application of Search-Based techniques and a range of testing… 
Highly Cited
2012
Highly Cited
2012
Big Data is a term applied to data sets whose size is beyond the ability of traditional software technologies to capture, store… 
Highly Cited
2012
Highly Cited
2012
In this paper we address the problem of using MapReduce to sample a massive data set in order to produce a fixed-size sample… 
Highly Cited
2011
Highly Cited
2011
A typical MapReduce cluster is shared among different users and multiple applications. A challenging problem in such shared… 
Highly Cited
2011
Highly Cited
2011
The MapReduce parallel computational model is of increasing importance. A number of High Level Query Languages (HLQLs) have been… 
Highly Cited
2009
Highly Cited
2009
This paper describes CloudWF, a scalable and lightweight computational workflow system for clouds on top of Hadoop. CloudWF can… 
Highly Cited
2001
Highly Cited
2001
We present near-IR (J and Ks) number counts and colors of galaxies detected in deep VLT-ISAAC images centered on the Chandra Deep…