Apache Hadoop

Known as: HDFS, Hadoop YARN, Hadoop Distributed Filesystem

Apache Hadoop (pronunciation: /həˈduːp/) is an open-source software framework for distributed storage and distributed processing of very large data…

Wikipedia

Papers overview


Highly Cited

2016

An efficient divide-and-conquer approach for big data analytics in machine-to-machine communication

Highly Cited

2014

COSHH: A classification and optimization based scheduler for heterogeneous Hadoop systems

2013

Big data analysis using Apache Hadoop

Jyoti Nandimath, Ekata Banerjee, Ankur Patil, Pratima Kakade, S. Vaidya
IEEE International Conference on Information…
2013
Corpus ID: 14888045

The paradigm of processing huge datasets has been shifted from centralized architecture to distributed architecture. As the…

Highly Cited

2013

Performance Overhead among Three Hypervisors: An Experimental Study Using Hadoop Benchmarks

Jack Li, Qingyang Wang, Deepal Jayasinghe, Junhee Park, T. Zhu, C. Pu
IEEE International Congress on Big Data
2013
Corpus ID: 16139464

Hypervisors are widely used in cloud environments and their impact on application performance has been a topic of significant…

Highly Cited

2012

Shared disk big data analytics with Apache Hadoop

Anirban Mukherjee, J. Datta, Raghavendra Jorapur, Ravi Singhvi, S. Haloi, Wasim Akram
International Conference on High Performance…
2012
Corpus ID: 18511020

Big Data is a term applied to data sets whose size is beyond the ability of traditional software technologies to capture, store…

Highly Cited

2012

A Parallel Genetic Algorithm Based on Hadoop MapReduce for the Automatic Generation of JUnit Test Suites

Linda Di Geronimo, F. Ferrucci, Alfonso Murolo, Federica Sarro
IEEE Fifth International Conference on Software…
2012
Corpus ID: 9314009

Software testing represents one of the most explored fields of application of Search-Based techniques and a range of testing…

Highly Cited

2011

Play It Again, SimMR!

Abhishek Verma, L. Cherkasova, R. Campbell
IEEE International Conference on Cluster…
2011
Corpus ID: 13893770

A typical MapReduce cluster is shared among different users and multiple applications. A challenging problem in such shared…

Highly Cited

2011

Comparing High Level MapReduce Query Languages

Robert J. Stewart, P. Trinder, Hans-Wolfgang Loidl
Advanced Parallel Programming Technologies
2011
Corpus ID: 829167

The MapReduce parallel computational model is of increasing importance. A number of High Level Query Languages (HLQLs) have been…

Highly Cited

2010

YETI on the Cloud

M. Oriol, Faheem Ullah
Third International Conference on Software…
2010
Corpus ID: 17080036

The York Extensible Testing Infrastructure (YETI) is an automated random testing tool that allows to test programs written in…

Highly Cited

2001

Color and Number Counts

P. Saracco, E. Giallongo, E. Vanzella
2001
Corpus ID: 16018505

We present near-IR (J and Ks) number counts and colors of galaxies detected in deep VLT-ISAAC images centered on the Chandra Deep…
