• Publications
Executing SQL over encrypted data in the database-service-provider model
Rapid advances in networking and Internet technologies have fueled the emergence of the "software as a service" model for enterprise computing. Successful examples of commercially viable software…
  • 1,260
  • 105
  • Open Access
Efficient parallel set-similarity joins using MapReduce
In this paper we study how to efficiently perform set-similarity joins in parallel using the popular MapReduce framework. We propose a 3-stage approach for end-to-end set-similarity joins. We take as…
  • 468
  • 44
  • Open Access
Efficient Merging and Filtering Algorithms for Approximate String Searches
We study the following problem: how to efficiently find in a collection of strings those similar to a given query string? Various similarity functions can be used, such as edit distance, Jaccard…
  • 275
  • 27
  • Open Access
Processing Spatial-Keyword (SK) Queries in Geographic Information Retrieval (GIR) Systems
Location-based information contained in publicly available GIS databases is invaluable for many applications such as disaster response, national infrastructure protection, crime analysis, and…
  • 249
  • 19
  • Open Access
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
Many applications need to solve the following problem of approximate string matching: from a collection of strings, how to find those similar to a given string, or the strings in another (possibly…
  • 194
  • 14
  • Open Access
Efficient record linkage in large data sets
This paper describes an efficient approach to record linkage. Given two lists of records, the record-linkage problem consists of determining all pairs that are similar to each other where the overall…
  • 223
  • 13
  • Open Access
Relaxing join and selection queries
Database users can be frustrated by having an empty answer to a query. In this paper, we propose a framework to systematically relax queries involving joins and selections. When considering relaxing…
  • 118
  • 11
  • Open Access
Efficient interactive fuzzy keyword search
Traditional information systems return answers after a user submits a complete query. Users often feel "left in the dark" when they have limited knowledge about the underlying data, and have to use a…
  • 225
  • 10
  • Open Access
A conserved role for atlastin GTPases in regulating lipid droplet size.
Lipid droplets (LDs) are the major fat storage organelles in eukaryotic cells, but how their size is regulated is unknown. Using genetic screens in C. elegans for LD morphology defects in intestinal…
  • 87
  • 9
AsterixDB: A Scalable, Open Source BDMS
AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it…
  • 134
  • 9
  • Open Access