Corpus ID: 212455695

An Efficient Approach to Optimize the Performance of Massive Small Files in Hadoop MapReduce Framework

@inproceedings{Prasad2017AnEA,
  title={An Efficient Approach to Optimize the Performance of Massive Small Files in Hadoop MapReduce Framework},
  author={G. Prasad and Swathi C. Prabhu},
  year={2017}
}
Hadoop, the most popular open-source distributed computing framework, was designed by Doug Cutting and his team; it involves thousands of nodes to process and analyze huge amounts of data, called Big Data. The major core components of Hadoop are HDFS (Hadoop Distributed File System) and MapReduce. This framework is the most popular and powerful for storing, managing, and processing Big Data applications. A drawback of this tool, however, is stability and performance issues with small file…
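The small-files bottleneck the abstract alludes to stems from the NameNode holding an in-memory metadata object (roughly 150 bytes each) for every file and block, so millions of tiny files exhaust NameNode memory and produce many undersized map tasks. A common mitigation, not necessarily the scheme this paper proposes, is to pack many small files into a single SequenceFile keyed by file name. Below is a minimal sketch using Hadoop's standard Java client API; the class name SmallFilePacker and the command-line arguments are illustrative.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

/**
 * Packs a directory of small files into one SequenceFile so the NameNode
 * tracks a single large file instead of thousands of tiny ones.
 * (Illustrative sketch; paths are taken from the command line.)
 */
public class SmallFilePacker {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path inputDir = new Path(args[0]); // directory holding the small files
        Path packed = new Path(args[1]);   // output SequenceFile

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(packed),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            for (FileStatus status : fs.listStatus(inputDir)) {
                if (!status.isFile()) {
                    continue; // skip subdirectories
                }
                byte[] contents = new byte[(int) status.getLen()];
                try (FSDataInputStream in = fs.open(status.getPath())) {
                    in.readFully(0, contents); // small file: read it whole
                }
                // Key = original file name, value = raw bytes; a later
                // MapReduce job streams key/value pairs from the packed file.
                writer.append(new Text(status.getPath().getName()),
                              new BytesWritable(contents));
            }
        }
    }
}
```

Because the packed file is splittable, MapReduce can then schedule a few large input splits instead of one map task per tiny file, which is the per-task overhead that degrades small-file workloads.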
1 Citation

Performance Analysis of ECG Big Data using Apache Hive and Apache Pig

References

Showing 1–10 of 24 references
A novel approach to improve the performance of Hadoop in handling of small files
Improving performance of small-file accessing in Hadoop
SFMapReduce: An optimized MapReduce framework for Small Files
Improving metadata management for small files in HDFS
A novel indexing scheme for efficient handling of small files in Hadoop Distributed File System
Efficient prefetching technique for storage of heterogeneous small files in Hadoop Distributed File System Federation
Improving the Efficiency of Storing for Small Files in HDFS
Small files storing and computing optimization in Hadoop parallel rendering