In-Transit Molecular Dynamics Analysis with Apache Flink - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Communication Dans Un Congrès Année : 2018

In-Transit Molecular Dynamics Analysis with Apache Flink

Résumé

In this paper, an on-line parallel analytics framework is proposed to process and store in transit all the data being generated by a Molecular Dynamics (MD) simulation run using staging nodes in the same cluster executing the simulation. The implementation and deployment of such a parallel workflow with standard HPC tools, managing problems such as data partitioning and load balancing, can be a hard task for scientists. In this paper we propose to leverage Apache Flink, a scalable stream processing engine from the Big Data domain, in this HPC context. Flink enables to program analyses within a simple window based map/reduce model, while the runtime takes care of the deployment, load balancing and fault tolerance. We build a complete in transit analytics workflow, connecting an MD simulation to Apache Flink and to a distributed database, Apache HBase, to persist all the desired data. To demonstrate the expressivity of this programming model and its suitability for HPC scientific environments, two common analytics in the MD field have been implemented. We assessed the performance of this framework, concluding that it can handle simulations of sizes used in the literature while providing an effective and versatile tool for scientists to easily incorporate on-line parallel analytics in their current workflows.
Fichier principal
Vignette du fichier
paper.pdf (1.2 Mo) Télécharger le fichier
diagram.jpg (49.66 Ko) Télécharger le fichier
diagram.png (62.05 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-01889939 , version 1 (08-10-2018)

Identifiants

  • HAL Id : hal-01889939 , version 1

Citer

Henrique C Zanúz, Bruno Raffin, Omar A Mures, Emilio J Padrón. In-Transit Molecular Dynamics Analysis with Apache Flink. ISAV 2018 - In Situ Infrastructures for Enabling Extreme-scale Analysis and Visualization, Nov 2018, DALLAS, United States. pp.1-8. ⟨hal-01889939⟩
202 Consultations
517 Téléchargements

Partager

Gmail Facebook X LinkedIn More