Apache Hadoop

Known as: HDFS, Hadoop YARN, Hadoop Distributed Filesystem 
Apache Hadoop (pronunciation: /həˈduːp/) is an open-source software framework for distributed storage and distributed processing of very large data…
Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2015
Facebook recently deployed Facebook Messages, its first ever user-facing application built on the Apache Hadoop platform. Apache…
Review
2015
Cloudera Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment…
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • figure 5
Highly Cited
2014
We live in an on-demand, on-command Digital universe with data proliferating by Institutions, Individuals and Machines at a very high…
  • figure 7
  • figure 8
  • figure 9
  • figure 10
  • figure 11
Highly Cited
2013
The initial design of Apache Hadoop [1] was tightly focused on running massive, MapReduce jobs to process a web crawl. For…
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • table 1
Review
2011
Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to…
Review
2010
  • M. Bhandarkar
  • IEEE International Symposium on Parallel…
  • 2010
  • Corpus ID: 22605354
Apache Hadoop has become the platform of choice for developing large-scale data-intensive applications. In this tutorial, we will…
Highly Cited
2010
MapReduce and its variants have been highly successful in implementing large-scale data-intensive applications on commodity…
  • figure 1
  • figure 2
Highly Cited
2010
While the use of MapReduce systems (such as Hadoop) for large scale data analysis has been widely recognized and studied, we have…
  • table 1
  • figure 1
  • table 2
  • figure 2
  • figure 3
Highly Cited
2010
As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a…
  • figure 1
  • figure 3
  • figure 2
  • figure 4
  • table 1
Highly Cited
2009
Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop…
  • table 1-1
  • figure 1-1