Evaluating Hadoop for Data-Intensive Scientific Operations

Zacharia Fadika, Madhusudhan Govindaraju, Shane Canon, and Lavanya Ramakrishnan
2012 IEEE Fifth International Conference on Cloud Computing
Emerging sensor networks, more capable instruments, and ever-increasing simulation scales are generating data at a rate that exceeds our ability to effectively manage, curate, analyze, and share it. Data-intensive computing is expected to revolutionize the next-generation software stack. Hadoop, an open source implementation of the MapReduce model, provides a way for large data volumes to be seamlessly processed through the use of large clusters of commodity computers. The inherent parallelization…
This paper has 33 citations.