Angelos Vasilakopoulos

In this paper, we investigate the problem of computing a multiway join in one round of MapReduce when the data may be skewed. We optimize for communication cost, i.e., the amount of data that is transferred from the mappers to the reducers. We identify join attribute values that appear very frequently, called Heavy Hitters (HH). We distribute HH-valued records to…
Handling skew is one of the major challenges in query processing. In distributed computational environments such as MapReduce, uneven distribution of the data to the servers is not desired. One of the dominant measures that we want to optimize in distributed environments is communication cost. In a MapReduce job this is the amount of data that is…
During the past decade, there has been an extensive investigation of the computational complexity of the consistent answers of Boolean conjunctive queries under primary key constraints. Much of this investigation has focused on self-join-free Boolean conjunctive queries. In this paper, we study the consistent answers of Boolean conjunctive queries involving…
AI systems typically make decisions and find patterns in data by computing aggregate functions, specifically sums, expressed as queries over the data's attributes. This computation can become costly or even inefficient when these queries concern the whole or big parts of the data, especially when we are dealing with big data. New types of…
We propose an extension of possibilistic databases that also includes provenance. The introduction of provenance makes our model closed under selection with equalities, projection, and join. In addition, query computation with possibilities is polynomial, in contrast with current models that combine provenance with probabilities and have #P…
We define and investigate the computational complexity of the query containment problem for data that support both uncertainty and lineage. Query containment depends on the definition of database containment which, for traditional databases, is defined as a simple set containment for each relation. As this is not the case in the presence of uncertainty and…