ECL-watch: A big data application performance tuning tool in the HPCC systems platform

Lili Xu, Edin Muharemagic, Amy Apon

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

9 Scopus citations

Abstract

The proliferation of Big Data processing environments such as Hadoop, Apache Spark, and HPCC Systems is driving the development of performance analysis tools for these distributed systems. The goal is to achieve high performance through the optimization of Big Data applications. However, fine-grained performance tuning is challenging because of the high complexity and massive size of these distributed systems. ECL-Watch is a comprehensive, fine-grained, dataflow-based Big Data performance analysis tool that utilizes ECL, the high-level declarative dataflow programming language of HPCC Systems. As a case study, we implement and optimize the Yinyang K-Means machine learning algorithm in ECL on HPCC Systems. The experimental results show that the performance of the native ECL version of the Yinyang K-Means algorithm improved significantly after tuning: from about three times slower than the standard K-Means implementation in ECL to roughly 15% faster than standard K-Means.

Original language: American English
Title of host publication: IEEE International Conference
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2941-2950
Number of pages: 10
ISBN (Electronic): 9781538627143
DOIs
State: Published - 2017
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: Dec 11 2017 to Dec 14 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Volume: 2018-January

Conference

Conference: 5th IEEE International Conference on Big Data, Big Data 2017
Country/Territory: United States
City: Boston
Period: 12/11/17 to 12/14/17

Keywords

  • Big Data
  • Distributed Computing
  • HPCC Systems
  • Machine Learning
  • Performance Analysis
  • Tuning and Optimization
