Apache Hadoop

Known as: HDFS, Hadoop YARN, Hadoop Distributed Filesystem 
Apache Hadoop (pronunciation: /həˈduːp/) is an open-source software framework for distributed storage and distributed processing of very large data…
Wikipedia

Papers overview

2014
We live in an on-demand, on-command digital universe with data proliferating from institutions, individuals, and machines at a very high…
Highly Cited
2013
The initial design of Apache Hadoop [1] was tightly focused on running massive MapReduce jobs to process a web crawl. For…
2013
Digital video is a prominent form of big data spread all over the Internet. It is large not only in size but also in required processing…
2012
Big Data is a term applied to data sets whose size is beyond the ability of traditional software technologies to capture, store…
Highly Cited
2010
The size of data sets being collected and analyzed in the industry for business intelligence is growing rapidly, making…
Review
2010
Apache Hadoop has become the platform of choice for developing large-scale data-intensive applications. In this tutorial, we will…
Highly Cited
2010
Distributed processing frameworks, such as Yahoo!'s Hadoop and Google's MapReduce, have been successful at harnessing expansive…
Highly Cited
2010
User constraints such as deadlines are important requirements that are not considered by existing cloud-based data processing…
Highly Cited
2010
Many modern enterprises are collecting data at the most detailed level possible, creating data repositories ranging from…
Highly Cited
2009
AvroEventSerializer class, File Formats access control lists (ACLs), An example, ACLs accumulators, Accumulators ACLs (access…