Report (Research Report) Year: 1993

An architecture for tolerating processor failures in shared-memory multiprocessors

Michel Banâtre
  • Role: Author
  • PersonId: 833460
Alain Gefflaut
  • Role: Author
  • PersonId: 833607
Philippe Joubert
  • Role: Author
  • PersonId: 833613
Christine Morin

Abstract

In this paper, we focus on the problem of recovering from processor failures in shared-memory multiprocessors. We propose an architecture designed to tolerate processor failures transparently. The recoverable shared memory (RSM) is the main component of this architecture and provides a hardware-supported backward error recovery mechanism. This technique works with standard caches and cache coherence protocols and avoids rollback propagation. The performance of the architecture during normal execution is evaluated and compared with that of existing fault-tolerant shared-memory multiprocessors. The performance study was conducted by simulation, using address traces collected from real parallel applications.

Domains

Other [cs.OH]
Main file
RR-1965.pdf (348.97 KB)

Dates and versions

inria-00074708, version 1 (24-05-2006)

Identifiers

  • HAL Id: inria-00074708, version 1

Cite

Michel Banâtre, Alain Gefflaut, Philippe Joubert, Peter Lee, Christine Morin. An architecture for tolerating processor failures in shared-memory multiprocessors. [Research Report] RR-1965, INRIA. 1993. ⟨inria-00074708⟩