Conference paper, Year: 2015

An Eye on the Elephant in the Wild: A Performance Evaluation of Hadoop's Schedulers Under Failures

Abstract

Large-scale data analysis has increasingly come to rely on MapReduce and its open-source implementation, Hadoop. Hadoop is no longer used only to run single batch jobs; it has also been optimized to execute multiple jobs from multiple concurrent users simultaneously. Several schedulers (i.e., the Fifo, Fair, and Capacity schedulers) have been proposed to optimize the locality of task executions, but they do not consider failures, although evidence in the literature shows that faults do occur and can result in performance problems. In this paper, we design a set of experiments to evaluate the performance of Hadoop under failure with several schedulers (i.e., we explore the conflict between job scheduling, locality-aware task execution, and failures). Our results reveal several drawbacks in Hadoop's current mechanism for prioritizing failed tasks. By trying to launch failed tasks as soon as possible regardless of locality, this mechanism significantly increases the execution time of jobs with failed tasks, for two reasons: 1) available resources might not be freed up as quickly as expected, and 2) failed tasks might be re-executed on machines that hold none of their input data, introducing extra cost for transferring data over the network, which is normally the scarcest resource in today's data centers. Our preliminary study with Hadoop not only helps us understand the interplay between fault tolerance and job scheduling, but also offers useful insights into making the current schedulers more efficient in the presence of failures.
Main file: ARMS-CC2015.pdf (332.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01184236, version 1 (13-08-2015)

Identifiers

  • HAL Id: hal-01184236, version 1

Cite

Shadi Ibrahim, Tran Anh Phuong, Gabriel Antoniu. An Eye on the Elephant in the Wild: A Performance Evaluation of Hadoop's Schedulers Under Failures. ARMS-CC'15: The Second Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, held in conjunction with PODC 2015, Jul 2015, Donostia-San Sebastián, Spain. ⟨hal-01184236⟩