Synopsis: A Distributed Sketch over Voluminous Spatiotemporal Observational Streams

Abstract

Networked observational devices have proliferated in recent years, contributing to voluminous data streams from a variety of sources and problem domains. These streams often have a spatiotemporal component and include multidimensional features of interest. Processing such data in an offline fashion using batch systems or data warehouses is costly from both a storage and computational standpoint, and in many situations the insights derived from the data streams are useful only if they are timely. In this study, we propose Synopsis, an online, distributed sketch that is constructed from voluminous spatiotemporal data streams. The sketch summarizes feature values and inter-feature relationships in memory to facilitate real-time query evaluations and to serve as input to computations expressed using analytical engines. As the data streams evolve, Synopsis performs targeted dynamic scaling to ensure high accuracy and effective resource utilization. We evaluate our system in the context of two real-world spatiotemporal datasets and demonstrate its efficacy in both scalability and query evaluations.
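To make the idea of an in-memory summary over a spatiotemporal stream concrete, the following is a minimal single-process sketch. It is not the Synopsis implementation. The class names (`RunningStats`, `ToySketch`), the fixed latitude/longitude grid, the one-hour time buckets, and the choice of running mean and variance as the per-feature summary are all assumptions made for illustration. Inter-feature relationships, distribution across nodes, and dynamic scaling are left out.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running count, mean, and sum of squared deviations (Welford's method)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        # Single-pass update; the raw observation is not retained.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


class ToySketch:
    """Hypothetical summary keyed by (grid cell, time bucket, feature name)."""

    def __init__(self, cell_deg: float = 1.0, bucket_s: int = 3600):
        self.cell_deg = cell_deg    # assumed spatial resolution, in degrees
        self.bucket_s = bucket_s    # assumed temporal resolution, in seconds
        self.bins = defaultdict(lambda: defaultdict(RunningStats))

    def _key(self, lat: float, lon: float, ts: float):
        return (
            math.floor(lat / self.cell_deg),
            math.floor(lon / self.cell_deg),
            int(ts // self.bucket_s),
        )

    def observe(self, lat: float, lon: float, ts: float, features: dict) -> None:
        # Fold one observation into the summaries for its bin.
        stats = self.bins[self._key(lat, lon, ts)]
        for name, value in features.items():
            stats[name].update(value)

    def query_mean(self, lat: float, lon: float, ts: float, feature: str):
        # Answer from the summary alone; None if the bin has no such feature.
        key = self._key(lat, lon, ts)
        if key not in self.bins or feature not in self.bins[key]:
            return None
        return self.bins[key][feature].mean


sketch = ToySketch()
sketch.observe(40.5, -105.1, 10, {"temperature": 10.0})
sketch.observe(40.7, -105.9, 20, {"temperature": 20.0})
print(sketch.query_mean(40.6, -105.5, 30, "temperature"))
```

Memory in this toy version grows with the number of occupied bins and features, not with the number of observations, which is the property that lets a summary of this kind answer queries without storing the stream.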

DOI: 10.1109/TKDE.2017.2734661


Cite this paper

@article{Buddhika2017SynopsisAD,
  title={Synopsis: A Distributed Sketch over Voluminous Spatiotemporal Observational Streams},
  author={Thilina Buddhika and Matthew Malensek and Sangmi Lee Pallickara and Shrideep Pallickara},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  year={2017},
  volume={29},
  pages={2552-2566}
}