Skip to search formSkip to main contentSkip to account menu

Apache Hadoop

Known as: HDFS, Hadoop YARN, Hadoop Distributed Filesystem 
Apache Hadoop (pronunciation: /həˈduːp/) is an open-source software framework for distributed storage and distributed processing of very large data… 
Wikipedia (opens in a new tab)

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2018
Highly Cited
2018
Cloud computing is a promising distributed computing platform for big data applications, e.g., scientific applications, since… 
Review
2015
Review
2015
Model-driven engineering (MDE) often features quality assurance (QA) techniques to help developers creating software that meets… 
Highly Cited
2012
Highly Cited
2012
Software testing represents one of the most explored fields of application of Search-Based techniques and a range of testing… 
Highly Cited
2012
Highly Cited
2012
The improvement of file access performance is a great challenge in real-time cloud services. In this paper, we analyze… 
Highly Cited
2012
Highly Cited
2012
The scalability of Cloud infrastructures has significantly increased their applicability. Hadoop, which works based on a… 
Highly Cited
2012
Highly Cited
2012
Big Data is a term applied to data sets whose size is beyond the ability of traditional software technologies to capture, store… 
Highly Cited
2011
Highly Cited
2011
A typical MapReduce cluster is shared among different users and multiple applications. A challenging problem in such shared… 
Highly Cited
2009
Highly Cited
2009
The enormous growth of data in a variety of applications has increased the need for high performance data mining based on… 
Highly Cited
2009
Highly Cited
2009
Energy considerations are important for Internet datacenters operators, and MapReduce is a common Internet datacenter application… 
Highly Cited
2001
Highly Cited
2001
We present near-IR (J and Ks) number counts and colors of galaxies detected in deep VLT-ISAAC images centered on the Chandra Deep…