Conference paper. Year: 2005

Theft-Induced Checkpointing for Reconfigurable Dataflow Applications

Samir Jafar
Jean-Louis Roch

Abstract

In this paper, a new checkpoint/recovery protocol called theft-induced checkpointing is defined for dataflow computations in large heterogeneous environments. The protocol is especially useful in massively parallel multi-threaded computations, as found in cluster or grid computing, and uses the principle of work-stealing to distribute work. Because the state of an execution is based on a macro dataflow graph, the protocol is extremely flexible with respect to rollback. Specifically, it allows local rollback in dynamic heterogeneous systems, even under a different number of processors and processes. To maximize run-time efficiency, the overhead associated with checkpointing is shifted to the rollback operations whenever possible. Experimental results show that the induced overhead is very small.
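
The abstract gives no algorithmic detail, so the following is only a toy illustration of the general idea its name suggests: a work-stealing event is used as the moment to record a checkpoint. It is not the protocol from the paper. The names `Worker`, `steal`, `recover`, and `stable_storage`, and the choice to checkpoint only the victim's list of pending task names, are all assumptions made for this sketch.

```python
from collections import deque


class Worker:
    """Hypothetical worker holding a deque of pending dataflow tasks."""

    def __init__(self, name, tasks=()):
        self.name = name
        self.pending = deque(tasks)

    def run_one(self):
        # The owner takes work from the tail of its own deque.
        return self.pending.pop() if self.pending else None


def steal(victim, thief, stable_storage):
    """Move the oldest pending task from victim to thief.

    Assumption: the theft is the event that triggers a checkpoint of the
    victim's remaining work (here, just the names of its pending tasks).
    """
    if not victim.pending:
        return None
    task = victim.pending.popleft()  # thieves take from the head
    thief.pending.append(task)
    stable_storage[victim.name] = list(victim.pending)  # checkpoint on theft
    return task


def recover(failed_name, replacement, stable_storage):
    """Re-queue a failed worker's checkpointed tasks on another worker."""
    replacement.pending.extend(stable_storage.get(failed_name, []))


# Small demonstration with made-up task names.
storage = {}
w0 = Worker("w0", ["t1", "t2", "t3"])
w1 = Worker("w1")
stolen = steal(w0, w1, storage)

spare = Worker("spare")
recover("w0", spare, storage)  # pretend w0 failed after the theft
```

In this sketch no checkpoint is written while a worker runs undisturbed, which loosely mirrors the abstract's stated goal of keeping checkpointing overhead low during normal execution. Recovery onto `spare` shows rollback affecting a single worker's tasks and running on a different set of processes.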

Dates and versions

hal-00683887 , version 1 (30-03-2012)

Cite

Samir Jafar, Axel W. Krings, Thierry Gautier, Jean-Louis Roch. Theft-Induced Checkpointing for Reconfigurable Dataflow Applications. IEEE Electro/Information Technology Conference (EIT 2005), May 2005, Lincoln, United States. ⟨10.1109/EIT.2005.1626998⟩. ⟨hal-00683887⟩