Interval matrix multiplication on parallel architectures - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference Paper Year: 2012

Interval matrix multiplication on parallel architectures

Abstract

Achieving efficiency when implementing interval arithmetic computations is a difficult task. The work presented here deals with the efficient implementation of interval matrix multiplication on parallel architectures. A first issue is the choice of formulas. The main principle we adopted consists in resorting, as often as possible, to optimized routines such as BLAS3, as implemented for instance in Intel's MKL. To do so, the formulas chosen to implement interval arithmetic operations are based on the representation of intervals by their midpoint and radius. This approach was advocated by S. Rump in 1999 and is used in particular in his implementation IntLab. A panel of formulas for operations in the midpoint-radius representation exists: exact formulas can be found in A. Neumaier's book "Interval Methods for Systems of Equations" (1990); in his 1999 paper, S. Rump gave approximate formulas with fewer operations; and in his 2011 Ph.D. thesis, H.D. Nguyen gave a choice of formulas reaching various tradeoffs between operation count and accuracy. When these formulas for the addition and multiplication of two intervals are used in the classical formulas for matrix multiplication, they can be expressed as operations (additions and multiplications) on matrices of real numbers (either midpoints or radii); S. Rump recapitulates some such matrix expressions in a recent paper (2012). In this presentation, the merits of each approach are discussed in terms of the number of elementary operations, the use of BLAS3 routines for the matrix multiplication, and accuracy. The comparison of relative accuracies is first based on the assumption that arithmetic operations are implemented using exact arithmetic; we then compare these accuracies assuming that arithmetic operations are implemented using floating-point arithmetic.
A second issue concerns the adaptation to the architecture. Indeed, the architectures targeted in this study are parallel architectures such as multicores or GPUs. When implemented on such architectures, measures such as the arithmetic operation count are no longer relevant: the measured execution times do not relate directly to the operation count. This is explained by considerations of memory usage, multithreaded computation, and so on. We will show some experiments that take these architectural parameters into account and achieve good performance. We will also give some tradeoffs between memory consumption and memory traffic: it can, for instance, be beneficial to copy (parts of) the involved matrices into the right caches to avoid cache misses and heavy traffic.
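To make the idea concrete, here is a minimal sketch (not from the abstract itself) of how midpoint-radius interval matrix multiplication reduces to real matrix products, following the style of Rump's 1999 approximate formulas. It assumes exact arithmetic, as in the first comparison above: a guaranteed floating-point implementation would additionally need directed roundings. The function name `imm` is illustrative, and NumPy's `@` stands in for an optimized BLAS3 routine such as MKL's `gemm`.

```python
import numpy as np

def imm(mA, rA, mB, rB):
    """Enclosure of the product of two interval matrices, each given by a
    midpoint matrix and a nonnegative radius matrix.

    Based on the elementwise bound |x*y - a*b| <= |a|*beta + alpha*|b| + alpha*beta
    for x in [a - alpha, a + alpha], y in [b - beta, b + beta]: the result
    needs only three real matrix products, all expressible as BLAS3 calls.
    """
    mC = mA @ mB                                    # midpoint of the product
    rC = np.abs(mA) @ rB + rA @ (np.abs(mB) + rB)   # radius: |mA|rB + rA|mB| + rA rB
    return mC, rC
```

Grouping the radius as `rA @ (|mB| + rB)` merges two of the four naive products into one, which is one of the operation-count tradeoffs the presentation discusses.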
No file deposited

Dates and versions

hal-00750022 , version 1 (08-11-2012)

Identifiers

  • HAL Id : hal-00750022 , version 1

Cite

Philippe Théveny, Nathalie Revol. Interval matrix multiplication on parallel architectures. SCAN 2012: 15th GAMM-IMACS International Symposium on Scientific Computing, Computer Arithmetic and Verified Numerical Computations, Sep 2012, Novosibirsk, Russia. ⟨hal-00750022⟩
