Machines Tuning Machines: Configuring Distributed Stream Processors with Bayesian Optimization


Fischer, Lorenz; Gao, Shen; Bernstein, Abraham (2015). Machines Tuning Machines: Configuring Distributed Stream Processors with Bayesian Optimization. In: 2015 IEEE International Conference on Cluster Computing (CLUSTER 2015), Chicago, Illinois, USA, 8 September 2015 - 11 September 2015.

Abstract

Modern distributed computing frameworks such as Apache Hadoop, Spark, or Storm distribute the workload of applications across a large number of machines. Whilst they abstract away the details of distribution, they still require the programmer to set a number of configuration parameters before deployment. These parameter settings usually have a substantial impact on execution efficiency. Finding the right values for these parameters is considered a difficult task and requires domain, application, and framework expertise.
In this paper, we propose a machine learning approach to the problem of configuring a distributed computing framework. Specifically, we propose using Bayesian Optimization to find good parameter settings. In an extensive empirical evaluation, we show that Bayesian Optimization can effectively find good parameter settings for four different stream processing topologies implemented in Apache Storm, resulting in significant gains over a parallel linear approach.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Event End Date: 11 September 2015
Deposited On: 29 Oct 2015 08:05
Last Modified: 08 Dec 2017 14:27
Publisher: IEEE Computer Society
Publisher DOI: https://doi.org/10.1109/CLUSTER.2015.13
Other Identification Number: merlin-id:12241

Content: Accepted Version
Filetype: PDF
Size: 502kB