Efficient query evaluation on probabilistic databases
TLDR
It is shown that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods, and an optimization algorithm is described that can efficiently compute most queries.
Adversarial classification
TLDR
This paper views classification as a game between the classifier and the adversary, and produces a classifier that is optimal given the adversary's optimal strategy; experiments show that this approach can greatly outperform a classifier learned in the standard way.
Management of probabilistic data: foundations and challenges
TLDR
The foundations of managing data where the uncertainties are quantified as probabilities are described, and some fundamental theoretical results for query evaluation on probabilistic databases are presented.
Crowdsourcing Algorithms for Entity Resolution
TLDR
This paper considers the problem of designing optimal strategies for asking questions to humans that minimize the expected number of questions asked, and analyzes several strategies that were claimed to be "optimal" for this problem in a recent work but can perform arbitrarily badly in theory.
Aggregating crowdsourced binary ratings
TLDR
This paper obtains bounds on the error rate of the algorithm and shows it is governed by the expansion of the graph, and demonstrates, using several synthetic and real datasets, that the algorithm outperforms the state of the art.
Efficient Top-k Query Evaluation on Probabilistic Data
TLDR
This paper describes a novel approach that efficiently computes and ranks the top-k answers to a SQL query on a probabilistic database: it runs several Monte Carlo simulations in parallel, one for each candidate answer, and approximates each probability only to the extent needed to correctly compute the top-k answers.
The dichotomy of conjunctive queries on probabilistic structures
TLDR
The dichotomy property is a fundamental result on query evaluation on probabilistic databases and it gives a complete classification of the complexity of conjunctive queries.
The dichotomy of probabilistic inference for unions of conjunctive queries
TLDR
This work considers unions of conjunctive queries (UCQ), which are equivalent to positive, existential First Order Logic sentences and also to nonrecursive datalog programs, and proves a dichotomy theorem.
Robust Cardinality and Cost Estimation for Skyline Operator
TLDR
Robust techniques to estimate the cardinality and the computational cost of Skyline are proposed, and an empirical comparison shows that these techniques are substantially more effective than traditional approaches.
Probabilistic databases: diamonds in the dirt
TLDR
Treasures abound from hidden facts found in imprecise data sets.