MapReduce frameworks: Comparing Hadoop and HPCC

Fabian Fier, Johan-Christoph Freytag

Research output: Contribution to journal (Article)

Abstract

MapReduce and Hadoop are often used synonymously. For optimal runtime performance, Hadoop users have to consider various implementation details and configuration parameters. When conducting performance experiments with Hadoop on different algorithms, it is hard to choose a set of such implementation optimizations and configuration options that is fair to all algorithms. By fair we mean default configurations and automatic optimizations provided by the execution system, which ideally do not require manual intervention. HPCC is a promising alternative open-source implementation of MapReduce. We show that HPCC provides sensible default configuration values, allowing for fairer experimental comparisons. On the other hand, we show that HPCC users still have to consider implementing optimizations known from Hadoop.
Original language: American English
Journal: CEUR Workshop Proceedings
Publication status: Published - Jun 1 2017
