Provenance-Aware Storage Systems
TLDR
It is shown that with reasonable overhead, a Provenance-Aware Storage System can provide useful functionality not available in today's file systems or provenance management systems.
Network-Aware Operator Placement for Stream-Processing Systems
TLDR
A stream-based overlay network (SBON) is described: a layer between a stream-processing system and the physical network that manages operator placement for stream-processing systems and permits decentralized, large-scale multi-query optimization decisions.
Network Coordinates in the Wild
TLDR
A long-term study of a subset of a million-plus node coordinate system found that it exhibited some of the problems for which network coordinates are frequently criticized, for example, inaccuracy and fragility in the presence of violations of the triangle inequality.
Berkeley DB
TLDR
The design and technical features of Berkeley DB, including its ability to survive system and disk crashes, are described, along with the distribution and its license.
World Wide Web Cache Consistency
TLDR
Using trace-driven simulation, it is shown that a weak cache consistency protocol (the one used in the Alex ftp cache) reduces network bandwidth consumption and server load more than either time-to-live fields or an invalidation protocol and can be tuned to return stale data less than 5% of the time.
Disk Scheduling Revisited
TLDR
This work analyzes traditional disk scheduling techniques, which attempt to optimize head movement and guarantee fairness in response time, in the presence of long disk queues, and proposes two algorithms that take rotational latency into account.
An Implementation of a Log-Structured File System for UNIX
TLDR
This paper presents a redesign and implementation of the Sprite log-structured file system that is more robust and integrated into the vnode interface; it performs better than the 4BSD Fast File System (FFS) in a variety of benchmarks and not significantly worse than FFS in any test.
LLAMA: Efficient graph analytics using Large Multiversioned Arrays
TLDR
The evaluation shows that LLAMA's mutability introduces modest overheads of 3–18% relative to immutable CSR for in-memory execution and that it outperforms state-of-the-art out-of-memory systems in most cases, with a best-case improvement of 5x on breadth-first search.
Dealing with disaster: surviving misbehaved kernel extensions
TLDR
This paper explains how VINO uses software fault isolation as its safety mechanism and a lightweight transaction system to cope with resource-hoarding, and finds that while the overhead of these techniques is high relative to the cost of the extensions themselves, it is low relative to the benefits that extensibility brings.
Scalable Bayesian Rule Lists
TLDR
An algorithm for building probabilistic rule lists that is two orders of magnitude faster than previous work and optimizes the posterior of a Bayesian hierarchical model over rule lists.
...