Measuring the Performance of Data Placement Structures for MapReduce-based Data Warehousing Systems

@article{Makki2018MeasuringTP,
  title={Measuring the Performance of Data Placement Structures for MapReduce-based Data Warehousing Systems},
  author={S. Makki and M. R. Hasan},
  journal={International Journal of New Computer Architectures and Their Applications},
  year={2018},
  volume={8},
  pages={11--20}
}
  • S. Makki, M. R. Hasan
  • Published 2018
  • Computer Science
  • International Journal of New Computer Architectures and Their Applications
The exponential growth of data requires systems that provide a scalable and fault-tolerant infrastructure for efficiently storing and processing vast amounts of data. Hive is a MapReduce-based data warehouse for data aggregation and query analysis. It can arrange millions of rows of data into tables, and its data placement structures play a significant role in the performance of the warehouse. Hive also provides an SQL-like language called…
1 Citation
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
  • Highly Influenced

References

RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems
  • Y. He, R. Lee, +4 authors Z. Xu
  • Computer Science
  • 2011 IEEE 27th International Conference on Data Engineering
  • 2011
  • 262
  • Highly Influential
Hive - a petabyte scale data warehouse using Hadoop
  • 926
Major technical advancements in Apache Hive
  • 106
Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters
  • 18
Hadoop: The definitive guide (Vol. 54)
  • 2015
Scaling the Facebook data warehouse to 300 PB
  • 2014
The Data Explosion in 2014 Minute by Minute - Infographic