The design and implementation of Zap: a system for migrating computing environments

Steven Osman, Dinesh Subhraveti, Gong Su, Jason Nieh
We have created Zap, a novel system for transparent migration of legacy and networked applications. Zap provides a thin virtualization layer on top of the operating system that introduces pods, which are groups of processes that are provided a consistent, virtualized view of the system. This decouples processes in pods from dependencies to the host operating system and other processes on the system. By integrating Zap virtualization with a checkpoint-restart mechanism, Zap can migrate a pod of… 


Cruz: Application-Transparent Distributed Checkpoint-Restart on Standard Operating Systems

The Cruz mechanism provides comprehensive support for checkpointing and restoring application state, both at user level and within the OS, and eliminates the need to flush communication channels by exploiting the packet re-transmission behavior of TCP and existing OS support for packet filtering.

Virtualization Mechanisms for Mobility, Security and System Administration

This dissertation demonstrates that operating system virtualization is an effective method for solving many different types of computing problems by designing novel systems that make use of commodity software while solving problems that were not conceived when the software was originally written.

Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters

A ZapC Linux prototype is implemented and it is demonstrated that it provides low virtualization overhead and fast checkpoint-restart times for distributed network applications without any application, library, kernel, or network protocol modifications.

Live migration of virtual machines

The design options for migrating OSes running services with liveness constraints are considered, the concept of writable working set is introduced, and the design, implementation and evaluation of high-performance OS migration built on top of the Xen VMM are presented.

Transparent Checkpoint-Restart of Multiple Processes on Commodity Operating Systems

A transparent mechanism for commodity operating systems that can checkpoint multiple processes in a consistent state so that they can be restarted correctly at a later time and an efficient algorithm for recording process relationships and correctly saving and restoring shared state is introduced.

Secure Isolation and Migration of Untrusted Legacy Applications

Pods are coupled with a novel checkpoint-restart mechanism which allows processes to be migrated across minor operating system kernel versions with different security patches and allows system administrators the flexibility to patch their operating systems immediately.

Dynamic Migration of Computation Through Virtualization of the Mobile Platform

These experiments demonstrate that live migration can be used to dynamically offload computation from a MID device to a nearby desktop computer, taking only 25 s over a 100 Mbps Ethernet network and approximately 40 s over an 802.11n interface with a measured throughput of 110 Mbps.

Osprey: Operating system for predictable clouds

This paper describes an alternative approach to cloud computing where all user applications run on top of a single cloud operating system called Osprey, which allows dependable, predictable, and real-time computing by consistently managing all system resources and exporting relevant information to the applications.

Linux Support for Fast Transparent General Purpose Checkpoint/Restart of Multithreaded Processes in Loadable Kernel Module

The design and implementation of a multithreaded process checkpoint/restart system for Linux, which provides the capability of dynamic extension to increase compatibility and reduce system overhead, is described.

User-Space Process Virtualization in the Context of Checkpoint-Restart and Virtual Machines

This dissertation presents user-space process virtualization to decouple application processes from the external subsystems; an adaptive plugin-based approach is used to implement the virtualization layers that allow the checkpoint-restart system to grow organically.



CRAK: Linux Checkpoint/Restart As a Kernel Module

CRAK is the first system for Unix/Linux that provides transparent checkpoint/restart with the following properties: (1) it does not require any modifications of existing operating system or application code and (2) it supports migrating network sockets.

Process migration: a generalized approach using a virtualizing operating system

  • Tom Boyd, P. Dasgupta
  • Computer Science
    Proceedings 22nd International Conference on Distributed Computing Systems
  • 2002
This paper shows how regular shrink-wrapped applications can be migrated by developing a virtualizing operating system (vOS), residing on top of Windows 2000, that injects stock applications with the virtualizing software.

Supporting ubiquitous computing with stateless consoles and computation caches

Mechanisms to support a system architecture that provides ubiquitous access to a globally distributed and essentially unlimited set of anonymous computing resources, and restructured operating systems using a new abstraction: compute capsules.

Transparent process migration: Design alternatives and the sprite implementation

The Sprite operating system is used to offload work onto idle machines, and also to evict migrated processes when idle workstations are reclaimed by their owners, providing a high degree of transparency both for migrated processes and for users.

Optimizing the migration of virtual computers

This paper shows how to quickly move the state of a running computer across a network, including the state in its disks, memory, CPU registers, and I/O devices, and calls this state a capsule, and suggests that efficient capsule migration can improve user mobility and system management.

A "persistent connection" model for mobile and distributed systems

  • Yongguang Zhang, S. Dao
  • Computer Science
    Proceedings of Fourth International Conference on Computer Communications and Networks - IC3N'95
  • 1995
It is concluded that persistent connection is a convenient communication abstraction for reliable, adaptable, and reconfigurable applications.

Managing Checkpoints for Parallel Programs

CoCheck, a system for checkpointing message-passing parallel programs, is implemented, and the use of checkpoint servers specifically designed to move checkpoints from the checkpointing process, across the interconnection network, and on to stable storage is proposed.

Amoeba: a distributed operating system for the 1990s

A description is given of the Amoeba distributed operating system, which appears to users as a centralized system but has the speed, fault tolerance, security safeguards, and flexibility required for

Mobile Communication with Virtual Network Address Translation

The performance results show that VNAT has essentially no network performance overhead except when connections are migrated, in which case the overhead of the Linux prototype is less than 7 percent over a stock RedHat Linux system.

Mobility: Processes, Computers, and Agents

The author reveals how the design and implementation of a Mobile Internetworking Architecture and the challenges of Mobile Computing have changed over the past decade and how the landscape has changed since the advent of the net.