FP-Hadoop

Miguel Liroz-Gistau 1, Reza Akbarinia 1, Patrick Valduriez 1
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
FP-Hadoop makes the reduce side of Hadoop MapReduce more parallel and deals efficiently with data skew on the reduce side. It introduces a new phase, called intermediate reduce (IR), in which dynamically constructed blocks of intermediate values are processed in parallel by intermediate reduce workers. Our experiments with FP-Hadoop on synthetic and real benchmarks have shown excellent performance gains over native Hadoop, e.g., improvements of more than 10 times in reduce time and 5 times in total execution time.
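The intermediate reduce idea described above can be illustrated with a minimal Python sketch. It is not the FP-Hadoop API or implementation: the function names (`make_blocks`, `intermediate_reduce`, `final_reduce`, `reduce_key_with_ir`), the fixed block size, the thread pool, and the use of a sum as the reduce function are all assumptions made for illustration. In particular, the abstract says blocks are constructed dynamically, whereas this sketch simply cuts the values into fixed-size blocks. The sketch also assumes the reduce function is associative, so partial results can be combined in a final step.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def make_blocks(values, block_size):
    """Yield consecutive blocks of at most block_size values (assumed policy)."""
    it = iter(values)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def intermediate_reduce(block):
    """Hypothetical IR function: compute a partial result for one block."""
    return sum(block)


def final_reduce(partials):
    """Hypothetical final reduce: combine the partial results of all blocks."""
    return sum(partials)


def reduce_key_with_ir(values, block_size=4, workers=3):
    """Reduce the values of a single (possibly skewed) key in two steps."""
    blocks = list(make_blocks(values, block_size))
    # Each block is handed to a worker, so one heavy key is no longer
    # processed by a single reduce worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(intermediate_reduce, blocks))
    return final_reduce(partials)


# Ten intermediate values that all belong to the same key.
skewed_values = list(range(1, 11))
total = reduce_key_with_ir(skewed_values)
```

With a block size of 4, the ten values are split into three blocks, each reduced independently before the final combination. Threads are used here only to keep the sketch self-contained; in a real cluster the blocks would be processed by separate workers.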
Document type : Software

https://hal.inria.fr/hal-02093002
Contributor : Reza Akbarinia
Submitted on : Monday, April 8, 2019 - 3:45:23 PM
Last modification on : Wednesday, June 26, 2019 - 4:29:42 PM

Citation

Miguel Liroz-Gistau, Reza Akbarinia, Patrick Valduriez. FP-Hadoop. 2019. ⟨hal-02093002⟩
