Application checkpointing

Known as: CryoPID, Checkpoint, System checkpointing

Checkpointing is a technique for adding fault tolerance to computing systems. It consists of saving a snapshot of the application's state, so…

Wikipedia
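
The definition above is cut off, so the following is only a minimal sketch of the general idea, not any particular system's method. It assumes the snapshot is used to resume after a failure, which is the usual purpose of checkpointing. The function names (save_checkpoint, load_checkpoint, run), the pickle file format, and the fail_at parameter used to simulate a crash are all hypothetical choices made for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write the state to `path` atomically (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a crash mid-write cannot
    # corrupt the previous checkpoint.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename over the old checkpoint
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, checkpoint_every=10, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing periodically.

    `fail_at` is an assumption for demonstration: it simulates a crash
    just before the given step is processed.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state["total"]
```

If a first call to run fails partway through, a second call with the same path restarts from the last saved step rather than from zero. Only the work done since the last checkpoint is repeated, so the checkpoint interval trades snapshot overhead against recomputation after a failure.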

Papers overview

2014

Reducing the computer memory requirement for 3D reverse-time migration with a boundary-wavefield extrapolation method

Sirui Tan, Lianjie Huang
2014
Corpus ID: 62531400

Reverse-time migration (RTM) using the crosscorrelation imaging condition requires that the forward-propagated source…

2008

Zest Checkpoint storage system for large supercomputers

P. Nowoczynski, N. Stone, J. Yanovich, J. Sommerfield
International Parallel Data Systems Workshop
2008
Corpus ID: 6620281

The PSC has developed a prototype distributed file system infrastructure that vastly accelerates aggregated write bandwidth on…

2007

Dynamic Virtual Clustering

Wesley Emeneker, D. Stanzione
IEEE International Conference on Cluster…
2007
Corpus ID: 14247722

Multiple clusters co-existing in a single research campus has become commonplace at many university and government labs, but…

2007

Dynamic Scheduling with Process Migration

C. Du, Xian-He Sun, Ming Wu
IEEE/ACM International Symposium on Cluster…
2007
Corpus ID: 8843985

Process migration is essential for runtime load balancing. In Grid and shared networked environments, load imbalance is not only…

Highly Cited

2006

Recovery Policies for Enhancing Web Services Reliability

A. Erradi, P. Maheshwari, V. Tosic
IEEE International Conference on Web Services…
2006
Corpus ID: 1049373

Web services are gaining acceptance as a standards-based approach for integrating loosely coupled services often distributed over…

2006

SWICH: A Prototype for Efficient Cache-Level Checkpointing and Rollback

Existing cache-level checkpointing schemes do not continuously support a large rollback window. Immediately after a checkpoint…

2002

A Distributed Parallel Programming Framework

N. Stankovic, Kang Zhang
IEEE Trans. Software Eng.
2002
Corpus ID: 10707807

This paper presents Visper, a novel object-oriented framework that identifies and enhances common services and programming…

Highly Cited

1997

A fault-tolerant object service on CORBA

Deron Liang, Chen-Liang Fang, S. Yuan, Chyouhwa Chen, G. Jan
Proceedings of 17th International Conference on…
1997
Corpus ID: 10911981

There are more and more COSS (Common Object Service Specifications) on CORBA (Common Object Request Broker Architecture…

1986

Availability of a distributed computer system with failures

A model for distributed systems with failing components is presented. Each node may fail and during its recovery the load…

1983

Checkpoint and Restart in Distributed Transaction Systems

J. Moss
Symposium on Reliability in Distributed Software…
1983
Corpus ID: 7466587
