Design of Availability-dependent Distributed Services in Large-scale Uncooperative Settings

Abstract

Thesis Statement: Availability-dependent global predicates can be efficiently and scalably realized for a class of distributed services, in spite of specific selfish and colluding behaviors, using local and decentralized protocols.
Several types of large-scale distributed systems spanning the Internet have to deal with availability variations among their constituent nodes. In dealing with churn and low-availability nodes, we believe it is important to link the availability of a node to the service the node receives from the distributed system. In other words, high availability has to be incentivized with better service. This problem imposes two types of requirements. First, metrics such as message overhead, CPU usage, memory overhead, and latency need to be optimized to achieve scalability and efficiency. Second, in open distributed systems spanning multiple organizations, the protocols have to tolerate selfish and colluding nodes, i.e., low-availability nodes that attempt to receive better service than their availability warrants.
This thesis approaches the problem by explicitly linking each node's service to its availability, via the notion of a global predicate. We present a class of novel distributed protocols that achieve a given availability-dependent global predicate efficiently and scalably. These protocols execute in a fully decentralized manner, realizing the global predicates in an emergent fashion. Predicate satisfaction is resilient to churn and to selfish and colluding nodes. The eventual goal of the predicates is to help incentivize nodes to improve their availability in order to get better service. Our approach uses random and consistent techniques to build overlays, as well as probabilistic local actions such as message forwarding, monitoring, and auditing. This combination of techniques realizes the predicates and provides probabilistic tolerance to failures, both those caused by churn and those arising from selfish and colluding behaviors.
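To make the phrase "random and consistent" concrete, the following is a minimal sketch of one way such a relation between nodes could be defined. It is an illustration only: the abstract does not describe the mechanism, and the hash-threshold rule, the function names (`pair_hash`, `is_monitor`, `monitors_of`), and the parameters `k` (expected monitors per node) and `n` (system size) are all assumptions introduced here. The point of the sketch is that a rule computed from a hash of the two node identifiers looks random across pairs, yet any third party can recompute it, so a node cannot choose its own (possibly colluding) monitors.

```python
import hashlib

HASH_SPACE = 2 ** 64  # size of the integer range produced by pair_hash


def pair_hash(monitor: str, target: str) -> int:
    """Map an ordered (monitor, target) pair to an integer in [0, 2**64).

    The value depends only on the two identifiers, so every node
    computes the same result for the same pair.
    """
    digest = hashlib.sha256(f"{monitor}|{target}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_monitor(monitor: str, target: str, k: int, n: int) -> bool:
    """Hypothetical rule: monitor watches target iff hash/2**64 < k/n.

    Integer arithmetic is used so the comparison is exact.
    """
    if monitor == target:
        return False  # a node never monitors itself
    return pair_hash(monitor, target) * n < k * HASH_SPACE


def monitors_of(target: str, nodes: list[str], k: int) -> list[str]:
    """Return every node in `nodes` that the rule selects for `target`."""
    n = len(nodes)
    return [m for m in nodes if is_monitor(m, target, k, n)]


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(1000)]
    selected = monitors_of("node-0", nodes, k=8)
    # Roughly k monitors are expected on average; the exact count varies.
    print(len(selected), selected[:3])
```

Under this assumed rule, each node is expected to have about `k` monitors, and a claimed monitoring relationship can be checked by anyone who knows the two identifiers, `k`, and `n`.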
Concretely, this thesis makes three major, closely related contributions. First, we present AVMON, the first distributed availability monitoring service. AVMON builds random and consistent overlays for accurate and decentralized monitoring of the long term


Cite this paper

@inproceedings{Gupta2007DesignOA, title={Design of Availability-dependent Distributed Services in Large-scale Uncooperative Settings}, author={Indranil Gupta and Klara Nahrstedt and Bill Sanders and Anne-Marie Kermarrec}, year={2007} }