Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance...
Large-scale distributed systems are hard to deploy, and distributed hash tables (DHTs) are no exception. To lower the barriers facing DHT-based applications, we have created a public DHT service called OpenDHT. Designing a DHT that can be widely shared, both among mutually untrusting clients and among a variety of applications, poses two distinct...
We give improved approximation algorithms for a variety of latency minimization problems. In particular, we give a 3.59-approximation to the minimum latency problem, improving on previous algorithms by a multiplicative factor of 2. Our techniques also give similar improvements for related problems like k-traveling repairmen and its multiple depot...
times as many objects as the average node. Further imbalance may result due to non-uniform distribution of objects in the identifier space and a high degree of heterogeneity in object loads and node capacities. Additionally, a node's load may vary greatly over time since the system can be expected to experience continuous insertions and deletions of...
Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect which, by adopting a random graph topology, yields itself naturally to incremental...
Networks are complex and prone to bugs. Existing tools that check configuration files and data-plane state operate offline at timescales of seconds to hours, and cannot detect or prevent bugs as they arise. Is it possible to check network-wide invariants in real time, as the network state evolves? The key challenge here is to achieve extremely low...
Today's data centers face extreme challenges in providing low latency. However, fair sharing, a principle commonly adopted in current congestion control protocols, is far from optimal for satisfying latency requirements. We propose Preemptive Distributed Quick (PDQ) flow scheduling, a protocol designed to complete flows quickly and meet flow...
TCP and its variants have suffered from surprisingly poor performance for decades. We argue the TCP family has little hope to achieve consistent high performance due to a fundamental architectural deficiency: hard-wiring packet-level events to control responses without understanding the real performance result of its actions. We propose Performance-oriented...
Diagnosing problems in networks is a time-consuming and error-prone process. Existing tools to assist operators primarily focus on analyzing control plane configuration. Configuration analysis is limited in that it cannot find bugs in router software, and is harder to generalize across protocols since it must model complex configuration languages and...
A pervasive requirement of distributed systems is to deal with churn: change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use. First, we provide a comparison of the...