Cache-aware scheduling of scientific workflows in a multisite cloud - INRIA - Institut National de Recherche en Informatique et en Automatique
Journal article, Future Generation Computer Systems, Year: 2021

Cache-aware scheduling of scientific workflows in a multisite cloud

Abstract

Many scientific experiments today are performed using scientific workflows, which are becoming increasingly data-intensive. We consider the efficient execution of such workflows in a multisite cloud, leveraging heterogeneous resources available at multiple geo-distributed data centers. Since it is common for workflow users to reuse code or data from previous workflows, a promising approach for efficient workflow execution is to cache intermediate data in order to avoid re-executing entire workflows. However, caching intermediate data and scheduling workflows to exploit such caching in a multisite cloud is complex. In particular, workflow scheduling must be cache-aware in order to decide whether to reuse cached data or to re-execute workflows entirely. In this paper, we propose a solution for cache-aware scheduling of scientific workflows in a multisite cloud. Our solution includes a distributed and parallel architecture and new algorithms for adaptive caching, cache site selection, and dynamic workflow scheduling. We implemented our solution in the OpenAlea workflow system, together with cache-aware distributed scheduling algorithms. Our experimental evaluation in a three-site cloud with a real application in plant phenotyping shows that our solution can yield major performance gains, reducing total time by up to 42% when 60% of the input data is the same for each new execution.
Main file
FGCS_2020.pdf (2.96 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03189130, version 1 (02-04-2021)

Identifiers

Cite

Gaëtan Heidsieck, Daniel de Oliveira, Esther Pacitti, Christophe Pradal, Francois Tardieu, et al. Cache-aware scheduling of scientific workflows in a multisite cloud. Future Generation Computer Systems, 2021, 122, pp.172-186. ⟨10.1016/j.future.2021.03.012⟩. ⟨hal-03189130⟩