• Publications
Unreliable failure detectors for reliable distributed systems
TLDR
We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures.
  • 2,775
  • 486
Bigtable: A Distributed Storage System for Structured Data
TLDR
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
  • 4,891
  • 391
The weakest failure detector for solving consensus
TLDR
We determine what information about failures is necessary and sufficient to solve Consensus in asynchronous distributed systems subject to crash failures.
  • 813
  • 114
An efficient multicast protocol for content-based publish-subscribe systems
TLDR
We develop and evaluate a novel and efficient distributed algorithm for this purpose, called "link matching".
  • 600
  • 40
The weakest failure detector for solving consensus
TLDR
We determine what information about failures is necessary and sufficient to solve Consensus in asynchronous distributed systems subject to crash failures.
  • 318
  • 40
Matching events in a content-based subscription system
TLDR
We present an efficient, scalable solution to the matching problem for content-based subscription systems, with a time complexity that is sub-linear in the number of subscriptions and a space complexity that is linear.
  • 726
  • 37
Fault-tolerant wait-free shared objects
TLDR
We show that all responsive failure modes can be tolerated.
  • 103
  • 15
A Case for Message Oriented Middleware
TLDR
We propose an approach for developing this glue technology based on message flows and discuss the open research problems in realizing this approach.
  • 179
  • 14
On scalable and efficient distributed failure detectors
TLDR
In this paper, we quantify the optimal scalability, in terms of network load (in messages per second, with messages having a size limit), of distributed, complete failure detectors as a function of application-specified requirements.
  • 193
  • 12
Polylog randomized wait-free consensus
TLDR
I present the first randomized wait-free implementation of consensus from multiple-writer multiple-reader registers in which each process takes polylog (O(log^2 n)) expected steps.
  • 65
  • 10