Algorithm-based fault tolerance applied to P2P computing networks - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper. Year: 2009

Algorithm-based fault tolerance applied to P2P computing networks

Abstract

P2P computing platforms are subject to a wide range of attacks. In this paper, we propose a generalisation of the previous disk-less checkpointing approach to fault tolerance in High Performance Computing systems. Our contribution is twofold. First, instead of restricting ourselves to 2D checksums, which tolerate only a small number of node failures, we propose to base disk-less checkpointing on linear codes so as to tolerate a potentially large number of faults. Second, we compare and analyse Low Density Parity Check (LDPC) codes against classical Reed-Solomon (RS) codes with respect to different fault models suited to P2P systems. Our LDPC disk-less checkpointing method is well suited when only node disconnections are considered, but cannot deal with Byzantine peers. Our RS disk-less checkpointing method tolerates such Byzantine errors, but is restricted to exact finite field computations.
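To make the idea of checkpointing with a linear code concrete, here is a minimal sketch. It is not the construction from the paper. It assumes a prime field GF(P) with P = 65537, Vandermonde-style checksum weights (i+1)^j, and hypothetical helper names `encode`, `solve_mod` and `recover`. It covers only the disconnection (erasure) fault model, where the indices of the lost peers are known and the checksum peers themselves survive. Handling Byzantine peers would require a full RS error decoder, which is not shown.

```python
P = 65537  # prime modulus (an assumed choice); all state values must lie in [0, P)

def encode(states, r):
    """Compute r checksum vectors over GF(P).

    Checksum j is sum_i (i+1)^j * states[i] mod P, i.e. Vandermonde weights.
    """
    length = len(states[0])
    return [
        [sum(pow(i + 1, j, P) * s[t] for i, s in enumerate(states)) % P
         for t in range(length)]
        for j in range(r)
    ]

def solve_mod(A, B):
    """Solve A X = B over GF(P) by Gauss-Jordan elimination (rows of B are vectors)."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(row for row in range(col, n) if M[row][col] % P)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], P - 2, P)  # modular inverse via Fermat's little theorem
        M[col] = [v * inv % P for v in M[col]]
        for row in range(n):
            if row != col and M[row][col]:
                f = M[row][col]
                M[row] = [(a - f * b) % P for a, b in zip(M[row], M[col])]
    return [row[n:] for row in M]

def recover(survivors, checksums, lost):
    """Rebuild the states of the peers listed in `lost`.

    survivors: dict mapping peer index -> state vector of peers still connected.
    checksums: output of encode(); assumed intact in this sketch.
    """
    e = len(lost)
    if e > len(checksums):
        raise ValueError("more lost peers than checksums")
    # The first e checksums give a square Vandermonde system in the lost states.
    A = [[pow(i + 1, j, P) for i in lost] for j in range(e)]
    B = []
    for j in range(e):
        rhs = list(checksums[j])
        for i, s in survivors.items():
            w = pow(i + 1, j, P)
            rhs = [(c - w * v) % P for c, v in zip(rhs, s)]  # subtract known terms
        B.append(rhs)
    return dict(zip(lost, solve_mod(A, B)))
```

For example, with four peers and `checksums = encode(states, 2)`, calling `recover({0: states[0], 2: states[2]}, checksums, [1, 3])` rebuilds the states of peers 1 and 3 from the two surviving peers. The system is solvable because the weights i+1 are distinct modulo P, which makes the Vandermonde matrix invertible. The exact arithmetic in GF(P) reflects the restriction to exact finite field computations mentioned in the abstract.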
Main file
2009-06-ap2ps.pdf (145.55 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00786217, version 1 (08-02-2013)

Cite

Thomas Roche, Jean-Louis Roch, Mathieu Cunche. Algorithm-based fault tolerance applied to P2P computing networks. IEEE First International Conference on Advances in P2P Systems, Oct 2009, Sliema, Malta. pp. 144-149, ⟨10.1109/AP2PS.2009.30⟩. ⟨hal-00786217⟩
207 views
143 downloads
