Johannes Gehrke

Learn More
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called <i>k</i>-anonymity has gained popularity. In a <i>k</i>-anonymized dataset, each record is indistinguishable from at least <i>k</i> &minus; 1 other records with respect to certain identifying(More)
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering(More)
We introduce a new algorithm for mining sequential patterns. Our algorithm is especially efficient when the sequential patterns in the database are very long. We introduce a novel <i>depth-first</i> search strategy that integrates a depth-first traversal of the search space with effective pruning mechanisms.Our implementation of the search strategy combines(More)
The widespread distribution and availability of small-scale sensors, actuators, and embedded processors is transforming the physical world into a computing platform. One such example is a sensor network consisting of a large number of sensor nodes that combine physical sensing capabilities such as temperature, light, or seismic sensors with networking and(More)
We present a new algorithm for mining maximal frequent itemsets from a transactional database. Our algorithm is especially efficient when the itemsets in the database are very long. The search strategy of our algorithm integrates a depth-first traversal of the itemset lattice with effective pruning mechanisms. Our implementation of the search strategy(More)
Smart sensors are small wireless computing devices that sense information such as light and humidity at extremely high resolutions. A smart sensor query-processing architecture using database technology can facilitate deployment of sensor networks. Smart-sensor technology enables a broad range of ubiquitous computing applications. Their low cost, small(More)
There has been increasing interest in the problem of building accurate data mining models over aggregate data, while protecting privacy at the level of individual records. One approach for this problem is to randomize the values in individual records, and only disclose the randomized values. The model is then built over the randomized data, after first(More)
Sensor networks are being widely deployed for measurement, detection and surveillance applications. In these new applications, users issue long-running queries over a combination of stored data and sensor data. Most existing applications rely on a centralized system for collecting sensor data. These systems lack flexibility because data is extracted in a(More)