Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given …
Database technology remains underused in science, especially in the long tail: the small labs and individual researchers that collectively produce the majority of scientific output. These researchers increasingly require iterative, ad hoc analysis over ad hoc databases but cannot individually invest in the computational and intellectual infrastructure required …
In this paper, we design and analyze parallel algorithms for skyline queries. The skyline of a multidimensional set consists of the points for which no other point exists that is at least as good along every dimension. As a framework for parallel computation, we use both the MP model proposed in (Koutris and Suciu, PODS 2011), which requires that the data …
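The skyline definition in the abstract above can be illustrated with a minimal sequential sketch. This is not the parallel algorithm from the paper; it is a direct quadratic-time reading of the definition, assuming that larger values are better in every dimension and that exact duplicate points are collapsed. The names `at_least_as_good` and `skyline` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def at_least_as_good(q: Point, p: Point) -> bool:
    """True if q is at least as good as p in every dimension (assumption: larger is better)."""
    return all(qi >= pi for qi, pi in zip(q, p))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points for which no other distinct point is at least as good everywhere.

    Naive O(n^2) check of the definition, for illustration only.
    """
    # Collapse exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(points))
    return [
        p
        for p in unique
        if not any(q != p and at_least_as_good(q, p) for q in unique)
    ]


points = [(1, 5), (3, 3), (5, 1), (2, 2), (1, 1)]
result = skyline(points)
```

In this example, `(2, 2)` and `(1, 1)` are excluded because `(3, 3)` is at least as good as each of them in both dimensions, while the remaining three points are mutually incomparable.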
In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with p servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in …
In this demonstration, we will showcase Myria, our novel cloud service for big data management and analytics designed to improve productivity. Myria's goal is for users to simply upload their data and for the system to help them be self-sufficient data science experts on their data -- self-serve analytics. Using a web browser, Myria users can upload data, …
We study the problem of consistent query answering under primary key violations. In this setting, the relations in a database violate the key constraints and we are interested in maximal subsets of the database that satisfy the constraints, which we call repairs. For a boolean query Q, the problem CERTAINTY(Q) asks whether every such repair satisfies the …
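The notion of repairs described above can be made concrete with a small brute-force sketch. It is an illustration only, not the method studied in the paper, and it simplifies to a single relation whose primary key is the first `key_len` attributes. A repair then keeps exactly one fact from each group of facts that share a key. The function names `repairs` and `certain`, and the sample relation, are hypothetical; the check assumes the standard reading that a boolean query is certain when it holds in every repair.

```python
from itertools import product
from typing import Callable, FrozenSet, Hashable, Iterable, Iterator, Tuple

Fact = Tuple[Hashable, ...]


def repairs(relation: Iterable[Fact], key_len: int = 1) -> Iterator[FrozenSet[Fact]]:
    """Enumerate all repairs of one relation (assumption: key = first key_len attributes)."""
    blocks = {}
    for fact in set(relation):
        # Facts that agree on the key conflict with each other.
        blocks.setdefault(fact[:key_len], []).append(fact)
    # A maximal consistent subset picks exactly one fact per key block.
    for choice in product(*blocks.values()):
        yield frozenset(choice)


def certain(
    relation: Iterable[Fact],
    query: Callable[[FrozenSet[Fact]], bool],
    key_len: int = 1,
) -> bool:
    """True if the boolean query holds in every repair (exponential in the worst case)."""
    return all(query(r) for r in repairs(relation, key_len))


# Hypothetical relation Emp(name, dept) with key `name`; "alice" violates the key.
emp = [("alice", "sales"), ("alice", "hr"), ("bob", "sales")]

someone_in_sales = certain(emp, lambda r: any(f[1] == "sales" for f in r))
alice_in_sales = certain(emp, lambda r: ("alice", "sales") in r)
```

Here the relation has two repairs, one for each choice of Alice's department. The query "someone works in sales" holds in both because Bob's fact is never in conflict, whereas "Alice works in sales" fails in the repair that keeps `("alice", "hr")`.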
Increasingly data is being bought and sold online. To facilitate such transactions, online data marketplaces have emerged to provide a service for sellers to price views on their data, and buyers to buy such views. These marketplaces neither support the sale of ad-hoc queries (that are not one of the specified views), nor do they support queries that join …