Hive - A Warehousing Solution Over a Map-Reduce Framework
TLDR
We present Hive, an open-source data warehousing solution built on top of Hadoop.
  • 1,700
  • 214
Hive - a petabyte scale data warehouse using Hadoop
TLDR
The size of data sets being collected and analyzed in the industry for business intelligence is growing rapidly, making traditional warehousing solutions prohibitively expensive.
  • 926
  • 123
Data warehousing and analytics infrastructure at Facebook
TLDR
Scalable analysis on large data sets has been core to the functions of a number of teams at Facebook - both engineering and non-engineering.
  • 408
  • 26
Trio-One: Layering Uncertainty and Lineage on a Conventional DBMS (Demo)
TLDR
This paper describes Trio-One's translation scheme and system architecture, showing how it efficiently and easily supports the Trio data model and query language.
  • 90
  • 10
Making Aggregation Work in Uncertain and Probabilistic Databases
TLDR
We describe how aggregation is handled in the Trio system for uncertain and probabilistic data.
  • 53
  • 4
Peregrine: Low-latency queries on Hive warehouse data
How Facebook is analyzing big data.
  • 13