Understanding Spark Performance in Hybrid and Multi-Site Clouds - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper, Year: 2015

Understanding Spark Performance in Hybrid and Multi-Site Clouds

Roxana-Ioana Roman
Bogdan Nicolae
Alexandru Costan
Gabriel Antoniu

Abstract

Recently, hybrid multi-site big data analytics (which combines on-premise with off-premise resources) has gained popularity as a way to process large amounts of data on demand, without the additional capital investment needed to grow a single datacenter. However, making the most of hybrid setups for big data analytics is challenging because on-premise resources communicate with off-premise resources at significantly lower throughput and higher latency. Understanding the impact of this aspect is not trivial, especially in the context of modern big data analytics frameworks that introduce complex communication patterns and are optimized to overlap communication with computation in order to hide data transfer latencies. This paper contributes a work-in-progress study that aims to identify and explain this impact in relation to the known behavior on a single cloud. To this end, it analyses a representative big data workload on a hybrid Spark setup. Unlike previous experience that emphasized the low end-impact of network communications in Spark, we found significant overhead in the shuffle phase when the bandwidth between the on-premise and off-premise resources is sufficiently small.
Main file
main (1).pdf (3.59 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01239140, version 1 (14-12-2015)

License

Public domain

Identifiers

  • HAL Id: hal-01239140, version 1

Cite

Roxana-Ioana Roman, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu. Understanding Spark Performance in Hybrid and Multi-Site Clouds. BDAC-15 - 6th International Workshop on Big Data Analytics: Challenges and Opportunities (in conjunction with SC15), Nov 2015, Austin, TX, United States. ⟨hal-01239140⟩
882 views
749 downloads
