Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance…
Large-scale distributed systems are hard to deploy, and distributed hash tables (DHTs) are no exception. To lower the barriers facing DHT-based applications, we have created a public DHT service called OpenDHT. Designing a DHT that can be widely shared, both among mutually untrusting clients and among a variety of applications, poses two distinct…
Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect which, by adopting a random graph topology, yields itself naturally to incremental…
We give improved approximation algorithms for a variety of latency minimization problems. In particular, we give a 3.59-approximation to the minimum latency problem, improving on previous algorithms by a multiplicative factor of 2. Our techniques also give similar improvements for related problems like k-traveling repairmen and its multiple depot…
Most P2P systems that provide a DHT abstraction distribute objects randomly among "peer nodes" in a way that results in some nodes having Θ(log N) times as many objects as the average node. Further imbalance may result due to non-uniform distribution of objects in the identifier space and a high degree of heterogeneity in object loads and node…
Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements. We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow…
Networks are complex and prone to bugs. Existing tools that check configuration files and data-plane state operate offline at timescales of seconds to hours, and cannot detect or prevent bugs as they arise. Is it possible to check network-wide invariants in real time, as the network state evolves? The key challenge here is to achieve extremely low…
A pervasive requirement of distributed systems is to deal with churn: change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use. First, we provide a comparison of the…
Diagnosing problems in networks is a time-consuming and error-prone process. Existing tools to assist operators primarily focus on analyzing control plane configuration. Configuration analysis is limited in that it cannot find bugs in router software, and is harder to generalize across protocols since it must model complex configuration languages and…