Conference paper, Year: 2019

A control-theory approach for cluster autonomic management: maximizing usage while avoiding overload

Abstract

Cloud and HPC (High-Performance Computing) systems have become increasingly variable in their behavior, particularly in performance and power consumption, and this loss of predictability demands more runtime management. In this work, we describe results addressing autonomic administration of HPC systems for scientific workflow management through a control-theoretic approach. We propose a model described by parameters related to the key aspects of the infrastructure, thus achieving a deterministic dynamical representation that covers the diverse and time-varying behaviors of the real computing system. We then propose a model-predictive control loop with two objectives: maximizing cluster utilization by best-effort jobs and controlling the file server's load in the presence of external disturbances. The accuracy of the prediction relies on a parameter estimation scheme based on the EKF (Extended Kalman Filter), which adjusts the predictive model to the real system and makes the approach adaptive to parametric variations in the infrastructure. The closed-loop strategy shows a performance improvement and, consequently, a reduction in the total computation time. The problem is addressed in a general way, to allow implementation on similar HPC platforms as well as scaling to different infrastructures.
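To make the combination of parameter estimation and predictive control concrete, the following is a minimal sketch and not the paper's model. It assumes a hypothetical scalar file-server load x that evolves as x+ = a*x + B*u, where u is the number of best-effort jobs submitted, B is a known per-job load and a is an unknown decay factor. An EKF estimates a by treating it as an extra state, and a one-step-ahead rule (a degenerate, horizon-one stand-in for a full model-predictive controller) picks u. All names, constants and noise levels are illustrative assumptions.

```python
import numpy as np

B = 0.05       # assumed known load added per best-effort job (hypothetical)
U_MAX = 50.0   # assumed cap on jobs submitted per step (hypothetical)

def ekf_step(z, P, u, y, q=1e-4, r=0.01):
    """One EKF predict/update on the augmented state z = [load x, decay a]."""
    x, a = z
    # Predict: x+ = a*x + B*u is nonlinear in z (product a*x); a is a random walk.
    z_pred = np.array([a * x + B * u, a])
    F = np.array([[a, x], [0.0, 1.0]])     # Jacobian of the transition w.r.t. z
    P_pred = F @ P @ F.T + q * np.eye(2)
    # Update with the measured load y; only the load is observed.
    H = np.array([[1.0, 0.0]])
    S = H @ P_pred @ H.T + r               # innovation covariance, shape (1, 1)
    K = P_pred @ H.T / S                   # Kalman gain, shape (2, 1)
    z_new = z_pred + (K * (y - z_pred[0])).ravel()
    P_new = (np.eye(2) - K @ H) @ P_pred
    return z_new, P_new

def choose_jobs(x_est, a_est, load_ref):
    """Pick u so the one-step predicted load equals load_ref, within bounds."""
    u = (load_ref - a_est * x_est) / B
    return float(np.clip(u, 0.0, U_MAX))

def simulate(steps=200, a_true=0.8, load_ref=2.0, seed=0):
    """Closed loop on a simulated plant; returns (estimated a, mean late load)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    z = np.array([0.0, 0.5])               # deliberately wrong initial guess for a
    P = np.eye(2)
    loads = []
    for _ in range(steps):
        u = choose_jobs(z[0], z[1], load_ref)
        x = a_true * x + B * u + rng.normal(0.0, 0.02)   # simulated "real" system
        y = x + rng.normal(0.0, 0.05)                    # noisy load measurement
        z, P = ekf_step(z, P, u, y)
        loads.append(x)
    return float(z[1]), float(np.mean(loads[-50:]))

if __name__ == "__main__":
    a_est, mean_load = simulate()
    print(f"estimated a = {a_est:.3f}, mean load over last 50 steps = {mean_load:.3f}")
```

In this toy setting the controller only tracks the load reference well once the estimate of a has moved toward the value used by the simulated plant, which illustrates why the abstract ties prediction accuracy to the estimation scheme. A longer prediction horizon, job-queue constraints and external disturbances would be needed for anything closer to the setup described above.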
Main file
CCTA19_0092_FI.pdf (390.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02294272 , version 1 (23-09-2019)

Cite

Agustín Gabriel Yabo, Bogdan Robu, Olivier Richard, Bruno Bzeznik, Eric Rutten. A control-theory approach for cluster autonomic management: maximizing usage while avoiding overload. CCTA 2019 - 3rd IEEE Conference on Control Technology and Applications, Aug 2019, Hong Kong, China. pp.189-195, ⟨10.1109/CCTA.2019.8920473⟩. ⟨hal-02294272⟩