A sample-and-clean framework for fast and accurate query processing on dirty data

@inproceedings{Wang2014ASF,
  title={A sample-and-clean framework for fast and accurate query processing on dirty data},
  author={Jiannan Wang and Sanjay Krishnan and Michael J. Franklin and Kenneth Y. Goldberg and Tim Kraska and Tova Milo},
  booktitle={SIGMOD Conference},
  year={2014}
}
In emerging Big Data scenarios, obtaining timely, high-quality answers to aggregate queries is difficult due to the challenges of processing and cleaning large, dirty data sets. To increase the speed of query processing, there has been a resurgence of interest in sampling-based approximate query processing (SAQP). In its usual formulation, however, SAQP does not address data cleaning at all, and in fact, exacerbates answer quality problems by introducing sampling error. In this paper, we…
This paper has 75 citations.