Paraschos Koutris

Learn More
Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given(More)
We study the problem of computing a conjunctive query q in parallel, using p of servers, on a large database. We consider algorithms with one round of communication, and study the complexity of the communication. We are especially interested in the case where the data is skewed, which is a major challenge for scalable parallel query processing. We establish(More)
The availability of large data centers with tens of thousands of servers has led to the popular adoption of massive parallelism for data analysis on large datasets. Several query languages exist for running queries on massively parallel architectures, some based on the MapReduce infrastructure, others using proprietary implementations. Motivated by this(More)
In this paper, we design and analyze parallel algorithms for skyline queries. The skyline of a multidimensional set consists of the points for which no other point exists that is at least as good along every dimension. As a framework for parallel computation, we use both the MP model proposed in (Koutris and Suciu, PODS 2011), which requires that the data(More)
We develop a new pricing system, QueryMarket, for flexible query pricing in a data market based on an earlier theoretical framework (Koutris et al., PODS 2012). To build such a system, we show how to use an Integer Linear Programming formulation of the pricing problem for a large class of queries, even when pricing is computationally hard. Further, we(More)
In this demonstration, we will showcase Myria, our novel cloud service for big data management and analytics designed to improve productivity. Myria's goal is for users to simply upload their data and for the system to help them be self-sufficient data science experts on their data -- self-serve analytics. Using a web browser, Myria users can upload data,(More)
We explore the design and implementation of a scalable Datalog system using Hadoop as the underlying runtime system. Observing that several successful projects provide a relational algebra-based programming interface to Hadoop, we argue that a natural extension is to add recursion to support scalable social network analysis, internet traffic analysis, and(More)
In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with p servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in(More)