An Improved Approach for Analysis of Hadoop Data for All Files

@article{Jain2017AnIA,
  title={An Improved Approach for Analysis of Hadoop Data for All Files},
  author={Heena Jain and Ajay Goyal},
  journal={International Journal of Computer Applications},
  year={2017},
  volume={157},
  pages={15--20}
}
This paper implements an efficient framework on the Hadoop platform for almost all file types. The proposed methodology is based on various algorithms implemented on the Hadoop platform, such as Scan, Read, and Sort. Workloads of both small and large size, such as Facebook, Co-author, and Twitter, are used to analyze the algorithms. The experimental results show the performance of the proposed methodology. The methodology provides efficient running time…

Optimization of hadoop small file storage using priority model
  • V. Nivedita, J. Geetha
  • Computer Science
  • 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT)
  • 2017
Performance Evaluation of Apache Hadoop Benchmarks under a Dynamic Checkpointing Mechanism
Experimentation and Analysis of Dynamic Checkpoint on Apache Hadoop with Failure Scenarios
Employment of Optimal Approximations on Apache Hadoop Checkpoint Technique for Performance Improvements
Validation of a dynamic checkpoint mechanism for Apache Hadoop with failure scenarios
Política Customizada de Balanceamento de Réplicas para o HDFS Balancer do Apache Hadoop

References

The Hadoop Distributed File System
Fault Tolerance in Hadoop for Work Migration
SFMapReduce: An optimized MapReduce framework for Small Files
DiskReduce: RAID for data-intensive scalable computing
Towards Optimal Resource Provisioning for Running MapReduce Programs in Public Clouds
  • F. Tian, Keke Chen
  • Computer Science
  • 2011 IEEE 4th International Conference on Cloud Computing
  • 2011
MapReduce: Simplified Data Processing on Large Clusters
Oivos: Simple and Efficient Distributed Data Processing