KTV-Tree: Interactive Top-K Aggregation on Dynamic Large Dataset in the Cloud

@article{Tang2015KTVTreeIT,
  title={KTV-Tree: Interactive Top-K Aggregation on Dynamic Large Dataset in the Cloud},
  author={Yuzhe Richard Tang and Ling Liu and Jun'ichi Tatemura and Hakan Hacig{\"u}m{\"u}s},
  journal={2015 IEEE 35th International Conference on Distributed Computing Systems Workshops},
  year={2015},
  pages={136--141}
}
  • Y. Tang, Ling Liu, Jun'ichi Tatemura, Hakan Hacigümüs
  • Published 29 June 2015
  • Computer Science
  • 2015 IEEE 35th International Conference on Distributed Computing Systems Workshops
This paper studies the problem of supporting interactive top-k aggregation queries over dynamic data in the cloud. We propose KTV-TREE, a top-K Threshold-based materialized View TREE, which achieves fast processing of top-k aggregation queries through efficiently maintained materialized views. A segment-tree-based structure is adopted to organize the views hierarchically. A suite of protocols is proposed for incrementally maintaining the views. Experiments are performed for evaluating the… 
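
To make the idea of hierarchically organized, incrementally maintained views concrete, here is a minimal Python sketch. It is not the paper's protocol: the threshold-based pruning is omitted, and the class name `AggregateSegmentTree`, the notion of integer "slots" (for example time buckets), and the methods `update` and `top_k` are all assumptions made for illustration. Each tree node keeps a full per-item aggregate for its slot range, because top-k lists alone cannot in general be merged into a correct top-k of sums.

```python
from collections import Counter
import heapq


class AggregateSegmentTree:
    """Hypothetical sketch: a segment tree over slots 0..num_slots-1.

    Each node materializes per-item totals for the slot range it covers.
    This is an illustration only, not the KTV-Tree protocol itself.
    """

    def __init__(self, num_slots):
        self.n = num_slots
        # Array-based bottom-up segment tree: leaves live at [n, 2n),
        # node i has children 2i and 2i+1, and index 0 is unused.
        self.views = [Counter() for _ in range(2 * num_slots)]

    def update(self, slot, item, delta):
        """Incrementally maintain every view covering `slot`, leaf to root."""
        node = slot + self.n
        while node >= 1:
            self.views[node][item] += delta
            node //= 2

    def top_k(self, lo, hi, k):
        """Return the k items with the largest totals over slots [lo, hi)."""
        total = Counter()
        left, right = lo + self.n, hi + self.n
        # Standard bottom-up range walk: visits O(log n) disjoint nodes
        # whose ranges together cover exactly [lo, hi).
        while left < right:
            if left & 1:
                total.update(self.views[left])
                left += 1
            if right & 1:
                right -= 1
                total.update(self.views[right])
            left //= 2
            right //= 2
        # Rank by total; ties are broken by the item key.
        return heapq.nlargest(k, total.items(), key=lambda kv: (kv[1], kv[0]))


tree = AggregateSegmentTree(4)
tree.update(0, "a", 5)
tree.update(1, "b", 3)
tree.update(2, "a", 1)
tree.update(3, "c", 10)
result_prefix = tree.top_k(0, 3, 2)  # aggregates slots 0, 1 and 2 only
result_all = tree.top_k(0, 4, 1)     # aggregates every slot
```

An update touches one node per tree level, and a range query merges a logarithmic number of precomputed views rather than scanning every slot. Storing full aggregates at each node costs more space than a real system would likely accept, which is the gap a threshold-based scheme would presumably aim to close.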

References

Distributed Segment Tree: Support of Range Query and Cover Query over DHT

Range query, which is defined as finding all the keys in a certain range over the underlying P2P network, has received a lot of research attention recently. However, cover query, which is to find

PIQL: Success-Tolerant Query Processing in the Cloud

This paper proposes PIQL, a declarative language that also provides scale independence by calculating an upper bound on the number of key/value store operations that will be performed for any query, and presents the PIQL query processing system and evaluates its scale independence on hundreds of machines using two benchmarks.

Deferred Lightweight Indexing for Log-Structured Key-Value Stores

DELI is presented, a DEferred Lightweight Indexing scheme on the log-structured key-value stores that optimizes the performance of index garbage collection through tightly coupling its execution with a native routine process called compaction.

Optimal aggregation algorithms for middleware

An elegant and remarkably simple algorithm is analyzed that is optimal in a much stronger sense than FA, and is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability sense, but over every database.

Dynamo: Amazon's highly available key-value store

Dynamo is presented, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience and makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.

MapReduce: simplified data processing on large clusters

This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

Hive - a petabyte scale data warehouse using Hadoop

Hive is presented, an open-source data warehousing solution built on top of Hadoop that supports queries expressed in a SQL-like declarative language, HiveQL, which are compiled into map-reduce jobs that are executed using Hadoop.

Pig latin: a not-so-foreign language for data processing

A new language called Pig Latin is described, designed to fit in a sweet spot between the declarative style of SQL, and the low-level, procedural style of map-reduce, which is an open-source, Apache-incubator project, and available for general use.

Dryad: distributed data-parallel programs from sequential building blocks

The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.

Computational geometry: algorithms and applications

This introduction to computational geometry focuses on algorithms as all techniques are related to particular applications in robotics, graphics, CAD/CAM, and geographic information systems.