What is checkpointing? (2024)

What is Checkpointing?

Checkpointing is the process of periodically saving (or writing) the execution state of an application so that, if the execution is interrupted, the saved state can be used to continue the execution at a later time. Typically, the execution state is written to a file. Resuming the execution of an application from a previously saved state, or checkpoint, instead of starting it from scratch is referred to as the Restart phase.

What are the advantages of checkpointing?

Checkpointing saves time by letting an application resume its execution after a hardware failure in the underlying computing platform (e.g., a network interconnect failure) or after the platform becomes unavailable due to emergency maintenance. It also helps in overcoming the time limits associated with the different job queues/partitions: a job that cannot finish within the limit can be resubmitted and continue from its latest checkpoint.
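One way an application can react to an approaching time limit is to checkpoint when it receives a termination warning. The sketch below assumes that the scheduler or operator delivers SIGTERM some time before the job is killed, which depends on the site configuration. The function names install_checkpoint_handler and should_checkpoint_now are hypothetical and do not come from any library.

```c
#include <signal.h>

/* Set by the handler; the main loop polls it between iterations. */
static volatile sig_atomic_t checkpoint_requested = 0;

/* Signal handler: only records the request. The actual checkpoint is
   written later, from normal code, because file I/O is not safe here. */
static void on_sigterm(int signo) {
    (void)signo;
    checkpoint_requested = 1;
}

/* Install the handler for SIGTERM (assumed to be the warning signal).
   Returns 0 on success, -1 on failure. */
int install_checkpoint_handler(void) {
    return signal(SIGTERM, on_sigterm) == SIG_ERR ? -1 : 0;
}

/* Returns 1 once a termination warning has arrived, 0 otherwise. */
int should_checkpoint_now(void) {
    return checkpoint_requested != 0;
}
```

The application would call install_checkpoint_handler once at startup, test should_checkpoint_now at a safe point in its main loop, and write a checkpoint and exit cleanly when it returns 1.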

What are the different types of checkpointing?

The different types of checkpointing include system-level checkpointing, application-level checkpointing, and user-level or library-level checkpointing.

System-Level checkpointing involves taking core-dumps of the computational state of the machine or system on which the application is running.

Pros: It is convenient to use: no code changes are needed, and the user only specifies the checkpointing frequency.

Cons: The checkpoints have a large memory footprint because the entire execution state of the application and the operating system processes is saved during checkpointing, and system-administrator privileges are needed to install the additional code.

Example: Berkeley Lab Checkpointing and Restart (BLCR)

Library-Level or User-Level Checkpointing involves the use of libraries for taking checkpoints while being agnostic to kernel-level information such as process IDs.

Pros: It is useful for checkpointing applications without requiring any changes to the source-code or the operating system kernel.

Cons: The users may need to load the checkpointing library before starting their applications, and then, would need to dynamically link the loaded library to their applications. The checkpoints can have a large memory-footprint.

Example: DMTCP

Application-Level Checkpointing involves implementing the checkpoint-and-restart mechanism within the application itself. An efficient implementation of application-level checkpointing would require saving and reading the state of only those variables or data that are necessary for recreating the state of the entire application. Such variables or data are referred to as critical variables/data. As an example, consider the C code below (definition of myFct function is not included below).

    #include <math.h>

    int myFct(int v);         /* defined elsewhere; signature assumed */
    extern int randomNumber;  /* assumed to be defined elsewhere */

    int main(){
        int x = 4;
        int y = sqrt(x);
        int z = 0, i;         /* z must start at 0 before it accumulates */
        int j = x*y;
        for (i = 0; i < 100; i++){
            z += j * myFct(randomNumber * i);
        }
        return 0;
    }

In this code, "i" and "z" are critical variables as their values are updated and cannot be derived easily to recreate the execution state of the code once it is interrupted.
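A minimal sketch of how the loop above could save and restore its critical variables is shown here. It is one possible arrangement, not the author's implementation: the body of myFct, the fixed value of randomNumber, the text file format, and the names save_checkpoint, load_checkpoint, and run_loop are all assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for myFct, whose definition the article omits. */
int myFct(int v) { return v % 7; }

/* Write the critical variables i and z to path. Returns 0 on success. */
int save_checkpoint(const char *path, int i, int z) {
    FILE *f = fopen(path, "w");
    if (!f) return -1;
    int ok = fprintf(f, "%d %d\n", i, z) > 0;
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Read i and z back. Returns 0 on success, -1 if no usable checkpoint. */
int load_checkpoint(const char *path, int *i, int *z) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    int n = fscanf(f, "%d %d", i, z);
    fclose(f);
    return n == 2 ? 0 : -1;
}

/* Run the loop up to (not including) iteration stop. Resumes from the
   checkpoint in path if one exists, and writes a checkpoint every
   `every` iterations (every must be > 0) and once more at the end. */
int run_loop(const char *path, int stop, int every) {
    const int x = 4, y = 2, j = x * y;  /* cheap to recompute, not saved */
    const int randomNumber = 3;         /* assumed fixed input */
    int i = 0, z = 0;

    /* Restart phase: fall back to a fresh start if nothing can be read. */
    if (load_checkpoint(path, &i, &z) != 0) { i = 0; z = 0; }

    for (; i < stop; i++) {
        /* The saved z is the value before iteration i runs, so a
           restart repeats iteration i and nothing is counted twice. */
        if (i % every == 0) save_checkpoint(path, i, z);
        z += j * myFct(randomNumber * i);
    }
    save_checkpoint(path, i, z);
    return z;
}
```

Only i and z are written because x, y, and j can be recomputed at startup. If randomNumber changed between runs, it would also have to be saved.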

Pros: Application-level checkpointing does not rely on the availability of any external libraries or tools, and hence, is useful for writing portable applications.

Cons: The onus is on the user (or the developer) to implement it manually for each application, so users must understand the code of the applications they are checkpointing well enough to reengineer it and insert the checkpoint-restart logic. In return, an efficient implementation generates checkpoints with a smaller memory footprint and incurs lower I/O overhead than the other types of checkpointing.

In the case of distributed applications (message-passing or MPI applications), a checkpoint can be written as a "central checkpoint" involving a single process (typically the root, master, or manager process in the MPI world) or as a distributed checkpoint involving multiple processes and an appropriate parallel I/O API and strategy.

What are the side-effects of checkpointing?

Writing and reading the application states or checkpoints introduces additional I/O overheads. Depending upon the frequency of checkpointing and the size of the checkpoint files, these I/O overheads can noticeably increase the run-time and storage needs of an application.
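The trade-off between checkpoint frequency, checkpoint size, and overhead can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions: every checkpoint takes the same time to write, the application does no useful work while writing, and only the most recent checkpoints are retained. The type ckpt_cost and the function estimate_checkpoint_cost are hypothetical names.

```c
/* Estimated cost of periodic checkpointing. */
typedef struct {
    long   num_checkpoints;    /* checkpoints written during the run */
    double added_seconds;      /* total time spent writing them */
    double overhead_fraction;  /* added_seconds / compute_seconds */
    double storage_bytes;      /* disk space for the retained checkpoints */
} ckpt_cost;

/* Assumes compute_seconds > 0 and interval_seconds > 0.
   kept is the number of most recent checkpoints retained on disk. */
ckpt_cost estimate_checkpoint_cost(double compute_seconds,
                                   double interval_seconds,
                                   double write_seconds,
                                   double checkpoint_bytes,
                                   long kept) {
    ckpt_cost c;
    c.num_checkpoints = (long)(compute_seconds / interval_seconds);
    c.added_seconds = c.num_checkpoints * write_seconds;
    c.overhead_fraction = c.added_seconds / compute_seconds;
    long retained = c.num_checkpoints < kept ? c.num_checkpoints : kept;
    c.storage_bytes = retained * checkpoint_bytes;
    return c;
}
```

For example, a 24-hour run (86400 s) that checkpoints every 30 minutes with 60-second writes produces 48 checkpoints and 2880 s of added run-time, or about 3.3% overhead. Halving the interval doubles that overhead.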

Do you have sample code?

Here is the link to the GitHub repository containing sample code in C++ that has checkpointing and restart feature embedded in it: bsswfellowship/checkpointing at main · ritua2/bsswfellowship (github.com)

References

  1. Arora, R., Bangalore, P. & Mernik, M. A technique for non-invasive application-level checkpointing. J Supercomput 57, 227–255 (2011). https://doi.org/10.1007/s11227-010-0383-5
  2. Ritu Arora, Trung Nguyen, "ITALC: Interactive Tool for Application-Level Checkpointing", HUST17 workshop at SC17, November 2017.
