KTV-Tree: Interactive Top-K Aggregation on Dynamic Large Dataset in the Cloud
@article{Tang2015KTVTreeIT, title={KTV-Tree: Interactive Top-K Aggregation on Dynamic Large Dataset in the Cloud}, author={Yuzhe Richard Tang and Ling Liu and Jun'ichi Tatemura and Hakan Hacig{\"u}m{\"u}s}, journal={2015 IEEE 35th International Conference on Distributed Computing Systems Workshops}, year={2015}, pages={136-141} }
This paper studies the problem of supporting interactive top-k aggregation queries over dynamic data in the cloud. We propose KTV-Tree, a top-K Threshold-based materialized View TREE, which achieves fast processing of top-k aggregation queries by efficiently maintaining materialized views. A segment-tree-based structure is adopted to organize the views in a hierarchical manner. A suite of protocols is proposed for incrementally maintaining the views. Experiments are performed for evaluating the…
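The general idea of organizing aggregate views in a segment tree can be illustrated with a minimal sketch. This is not the paper's KTV-Tree: the class name `ViewSegmentTree`, the slot-based key range, and the use of full per-key sums at every node are assumptions made for illustration, and the threshold-based pruning and distributed maintenance protocols mentioned in the abstract are not reproduced here.

```python
import heapq
from collections import Counter


class ViewSegmentTree:
    """Hypothetical sketch: each node holds per-key sums for a range of slots."""

    def __init__(self, num_slots):
        # Round the leaf count up to a power of two for simple array indexing.
        self.size = 1
        while self.size < num_slots:
            self.size *= 2
        # views[1] is the root; leaves start at index self.size.
        self.views = [Counter() for _ in range(2 * self.size)]

    def update(self, slot, key, delta):
        """Incrementally add `delta` for `key` along the leaf-to-root path."""
        node = self.size + slot
        while node >= 1:
            self.views[node][key] += delta
            node //= 2

    def top_k(self, lo, hi, k):
        """Return the k keys with the largest summed value over slots [lo, hi)."""
        total = Counter()
        left, right = lo + self.size, hi + self.size
        # Visit the O(log n) nodes that exactly cover [lo, hi) and merge their views.
        while left < right:
            if left & 1:
                total.update(self.views[left])
                left += 1
            if right & 1:
                right -= 1
                total.update(self.views[right])
            left //= 2
            right //= 2
        # Break ties by key so the result order is deterministic.
        return heapq.nlargest(k, total.items(), key=lambda kv: (kv[1], kv[0]))


tree = ViewSegmentTree(8)
for slot, key, delta in [(0, "a", 5), (1, "b", 3), (2, "a", 1), (3, "c", 7), (5, "b", 10)]:
    tree.update(slot, key, delta)

print(tree.top_k(0, 4, 2))  # [('c', 7), ('a', 6)]
print(tree.top_k(0, 8, 1))  # [('b', 13)]
```

In this sketch an update touches one node per tree level, and a range query merges a logarithmic number of precomputed views instead of scanning every slot. Storing full per-key sums at each node is the simplest choice and costs the most space; a threshold-based view would presumably keep only part of each aggregate.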
References
Distributed Segment Tree: Support of Range Query and Cover Query over DHT
- Computer Science, IPTPS
- 2006
Range query, which is defined as finding all the keys in a certain range over the underlying P2P network, has received a lot of research attention recently. However, cover query, which is to find…
PIQL: Success-Tolerant Query Processing in the Cloud
- Computer Science, Proc. VLDB Endow.
- 2011
This paper proposes PIQL, a declarative language that also provides scale independence by calculating an upper bound on the number of key/value store operations that will be performed for any query, and presents the PIQL query processing system and evaluates its scale independence on hundreds of machines using two benchmarks.
Deferred Lightweight Indexing for Log-Structured Key-Value Stores
- Computer Science, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
- 2015
DELI is presented, a DEferred Lightweight Indexing scheme for log-structured key-value stores that optimizes the performance of index garbage collection by tightly coupling its execution with a native routine process called compaction.
Optimal aggregation algorithms for middleware
- Computer Science, PODS '01
- 2001
An elegant and remarkably simple algorithm is analyzed that is optimal in a much stronger sense than FA, and is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability sense, but over every database.
Dynamo: amazon's highly available key-value store
- Computer Science, SOSP
- 2007
Dynamo is presented, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience and makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
MapReduce: simplified data processing on large clusters
- Computer Science, CACM
- 2008
This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Hive - a petabyte scale data warehouse using Hadoop
- Computer Science, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
- 2010
Hive is presented, an open-source data warehousing solution built on top of Hadoop that supports queries expressed in a SQL-like declarative language, HiveQL, which are compiled into map-reduce jobs that are executed using Hadoop.
Pig latin: a not-so-foreign language for data processing
- Computer Science, SIGMOD Conference
- 2008
A new language called Pig Latin is described, designed to fit in a sweet spot between the declarative style of SQL, and the low-level, procedural style of map-reduce, which is an open-source, Apache-incubator project, and available for general use.
Dryad: distributed data-parallel programs from sequential building blocks
- Computer Science, EuroSys '07
- 2007
The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.
Computational geometry: algorithms and applications
- Computer Science
- 1997
This introduction to computational geometry focuses on algorithms as all techniques are related to particular applications in robotics, graphics, CAD/CAM, and geographic information systems.