Measuring the Performance of Data Placement Structures for MapReduce-based Data Warehousing Systems
@article{Makki2018MeasuringTP,
  title={Measuring the Performance of Data Placement Structures for MapReduce-based Data Warehousing Systems},
  author={S. K. Makki and M. R. Hasan},
  journal={International Journal of New Computer Architectures and Their Applications},
  year={2018},
  volume={8},
  pages={11-20}
}
The exponential growth of data requires systems that provide a scalable, fault-tolerant infrastructure for efficiently storing and processing vast amounts of data. Hive is a MapReduce-based data warehouse for data aggregation and query analysis. It can arrange millions of rows of data into tables, and its data placement structures play a significant role in its performance. Hive also provides an SQL-like language called…
One Citation
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
- Computer Science
- ArXiv
- 2019