Océano is a prototype of a highly available, scalable, and manageable infrastructure for an e-business computing utility. It enables multiple customers to be hosted on a collection of sequentially shared resources. The hosting environment is divided into secure domains, each supporting one customer. These domains are dynamic: the resources assigned to them…
Network management systems built on a client/server model centralize responsibilities in client manager processes, with server agents playing restrictive support roles. As a result, managers must micro-manage agents through primitive steps, resulting in ineffective distribution of management responsibilities, failure-prone management bottlenecks, and…
Process groups in distributed applications and services rely on failure detectors to detect process failures completely, and as quickly, accurately, and scalably as possible, even in the face of unreliable message deliveries. In this paper, we look at quantifying the optimal scalability, in terms of network load, (in messages per…
We present Neptune - the resource director of Océano, a policy-driven fabric management system that dynamically reconfigures resources in a computing utility cluster. Neptune implements an on-line control mechanism subject to policy-based performance and resource configuration objectives. Neptune reassigns servers and bandwidth among a set of service…
Concert/C is a new language for distributed C programming that extends ANSI C to support distribution and process dynamics. Concert/C provides the ability to create and terminate processes, connect them together, and communicate among them. It supports transparent remote function calls (RPC) and asynchronous messages. Interprocess communications interfaces…
Network Dispatcher (ND) is a software tool that "routes" TCP connections to multiple TCP servers that share their workload. It exports a set of virtual IP addresses that are concealed and shared by the servers. It implements a novel dynamic load-sharing algorithm for allocation of TCP connections among servers according to their real-time load and…
Device failures, performance inefficiencies, and security compromises are some of the problems associated with the operations of networked systems. Effective management requires monitoring, interpreting, and controlling the behavior of the distributed resources. Current management systems pursue a platform-centered paradigm, where agents monitor the system and…