Stochastic Data Acquisition for Answering Queries as Time Goes by

Zheng Li and Tingjian Ge
Proc. VLDB Endow.
Data and actions are tightly coupled. On the one hand, data analysis results trigger decision making and actions. On the other hand, the action of acquiring data is the very first step in the whole data processing pipeline. Data acquisition almost always has some costs, which could be either monetary costs or computing resource costs such as sensor battery power, network transfers, or I/O costs. Using outdated data to answer queries can avoid the data acquisition costs, but there is a penalty of…


In-Database Machine Learning with SQL on GPUs
This work demonstrates that SQL with recursive tables makes it possible to express a complete machine learning pipeline of data preprocessing, model training, and validation; it also fine-tunes GPU kernels at the hardware level for higher throughput and proposes non-blocking synchronisation of multiple units.
Cost-efficient Data Acquisition on Online Data Marketplaces for Correlation Analysis
The search problem is proved to be NP-hard, and a heuristic data acquisition algorithm based on Markov chain Monte Carlo (MCMC) is designed and shown to be efficient and effective.
Directions in Blockchain Data Management and Analytics
Several open topics are discussed that researchers could increase focus on to leverage existing capabilities of mature data and information systems, enhance data security and privacy assurances, enable analytics services on blockchain as well as across off-chain data, and make blockchain-based systems active-oriented and intelligent.


Model-Driven Data Acquisition in Sensor Networks
Toward practical query pricing with QueryMarket
This work develops a new pricing system, QueryMarket, for flexible query pricing in a data market based on an earlier theoretical framework and shows how to use an Integer Linear Programming formulation of the pricing problem for a large class of queries, even when pricing is computationally hard.
Determining the Currency of Data
A model that specifies partial currency orders in terms of simple constraints is proposed, which allows us to express what values are copied from other data sources, bearing currency orders in those sources, in terms of copy functions defined on correlated attributes.
Mining of Massive Datasets
Determining relevant data is key to delivering value from massive amounts of data, and big data is defined less by volume, which is a constantly moving target, than by its ever-increasing variety, velocity, variability, and complexity.
Adaptive precision setting for cached approximate values
A parameterized algorithm for adjusting the precision of cached approximations adaptively to achieve the best performance as data values, precision requirements, or workload vary, which easily outperforms previous algorithms for exact caching.
Mining of Massive Datasets
This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.
Computing the median with uncertainty
A new model for computing with uncertainty is considered which focuses on the selection function f which returns the value of the kth smallest argument, and presents optimal offline and online algorithms for this problem.
Challenges and Opportunities with Big Data
The controversies surrounding Big Data are explored in an attempt to debunk the myths around it.
Planning and Acting in Partially Observable Stochastic Domains
Artificial Intelligence: A Modern Approach
The long-anticipated revision of this #1 selling book offers the most comprehensive, state-of-the-art introduction to the theory and practice of artificial intelligence for modern applications.