Learn More
Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given(More)
We present ConfErr, a tool for testing and quantifying the resilience of software systems to human-induced configuration errors. ConfErr uses human error models rooted in psychology and linguistics to generate realistic configuration mistakes; it then injects these mistakes and measures their effects, producing a resilience profile of the system under test.(More)
We address the problem of making online, parallel query plans fault-tolerant: i.e., provide intra-query fault-tolerance without blocking. We develop an approach that not only achieves this goal but does so through the use of different fault-tolerance techniques at different operators within a query plan. Enabling each operator to use a different(More)
Data-management-as-a-service systems are increasingly being used in collaborative settings, where multiple users access common datasets. Cloud providers have the choice to implement various optimizations, such as indexing or materialized views, to accelerate queries over these datasets. Each optimization carries a cost and may benefit multiple users. This(More)
Increasingly data is being bought and sold online. To facilitate such transactions, online data marketplaces have emerged to provide a service for sellers to price views on their data, and buyers to buy such views. These marketplaces neither support the sale of ad-hoc queries (that are not one of the specified views), nor do they support queries that join(More)
We develop a new pricing system, QueryMarket, for flexible query pricing in a data market based on an earlier theoretical framework (Koutris et al., PODS 2012). To build such a system, we show how to use an Integer Linear Programming formulation of the pricing problem for a large class of queries, even when pricing is computationally hard. Further, we(More)
Data confidentiality is one of the main concerns for users of public cloud services. The key problem is protecting sensitive data from being accessed by cloud administrators who have root privileges and can remotely inspect the memory and disk contents of the cloud servers. While encryption is the basic mechanism that can leveraged to provide data(More)
There exists a growing market for structured data on the Internet today, and this motivates a theoretical study of how relational data should be priced. We advocate for a framework where the seller defines a pricing scheme, by essentially stipulating the price of some queries, and the buyer is allowed to purchase data expressed by any query they wish: the(More)
In this demonstration, we show-case a database management system extended with a new type of component that we call a Data Use Manager (DUM). The DUM enables DBAs to attach policies to data loaded into the DBMS. It then monitors how users query the data, flags potential policy violations, recommends possible fixes, and supports offline analysis of user(More)
In this paper, we outline steps towards supporting "data analysis on a budget" when operating in a setting where data must be bought, possibly periodically. We model the problem, and explore the design choices for analytic applications as well as potentially fruitful algorithmic techniques to reduce the cost of acquiring data. Simulations suggest that an(More)