Predicting unstable software benchmarks using static source code features

Laaber, Christoph; Basmaci, Mikael; Salza, Pasquale (2021). Predicting unstable software benchmarks using static source code features. Empirical Software Engineering, 26(6):114.

Abstract

Software benchmarks are only as good as the performance measurements they yield. Unstable benchmarks show high variability among repeated measurements, which causes uncertainty about the actual performance and complicates reliable change assessment. However, whether a benchmark is stable or unstable only becomes evident after it has been executed and its results are available. In this paper, we introduce a machine-learning-based approach to predict a benchmark’s stability without having to execute it. Our approach relies on 58 statically computed source code features, extracted for benchmark code and code called by a benchmark, related to (1) meta information, e.g., lines of code (LOC), (2) programming language elements, e.g., conditionals or loops, and (3) potentially performance-impacting standard library calls, e.g., file and network input/output (I/O). To assess our approach’s effectiveness, we perform a large-scale experiment on 4,461 Go benchmarks coming from 230 open-source software (OSS) projects. First, we assess the prediction performance of our machine learning models using 11 binary classification algorithms. We find that Random Forest performs best, with good prediction performance between 0.79 and 0.90 AUC and between 0.43 and 0.68 MCC. Second, we perform feature importance analyses for individual features and feature categories. We find that 7 features related to meta information, slice usage, nested loops, and synchronization application programming interfaces (APIs) are individually important for good predictions; and that the combination of all features of the called source code is paramount for our model, while the combination of features of the benchmark itself is less important. Our results show that although benchmark stability is affected by more than just the source code, we can effectively utilize machine learning models to predict whether a benchmark will be stable or not ahead of execution. This enables spending precious testing time on reliable benchmarks, supporting developers in identifying unstable benchmarks during development, allowing unstable benchmarks to be repeated more often, estimating stability in scenarios where repeated benchmark execution is infeasible or impossible, and warning developers if new benchmarks or existing benchmarks executed in new environments will be unstable.
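
As a concrete illustration of the feature extraction described in the abstract, the sketch below uses Go's standard go/ast package to tally three of the feature kinds named above: loops, conditionals, and calls into potentially performance-impacting standard library packages (the packages os, net, and sync serve as assumed stand-ins here). This is a minimal sketch under those assumptions, not the authors' actual extractor, which computes 58 features over both the benchmark code and the code it calls.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countFeatures tallies a few of the static feature kinds the paper
// describes: loops, conditionals, and calls into potentially
// performance-impacting standard library packages. Illustrative only.
func countFeatures(file *ast.File) map[string]int {
	counts := map[string]int{}
	ast.Inspect(file, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.ForStmt, *ast.RangeStmt: // loop constructs
			counts["loops"]++
		case *ast.IfStmt, *ast.SwitchStmt: // conditionals
			counts["conditionals"]++
		case *ast.SelectorExpr: // qualified references, e.g. os.Open
			if pkg, ok := node.X.(*ast.Ident); ok {
				switch pkg.Name {
				case "os", "net", "sync": // I/O and synchronization APIs
					counts["stdlib:"+pkg.Name]++
				}
			}
		}
		return true
	})
	return counts
}

func main() {
	// A toy Go benchmark with a call into the os package inside the
	// measured loop, parsed from source without being executed.
	src := `package demo

import (
	"os"
	"testing"
)

func BenchmarkRead(b *testing.B) {
	for i := 0; i < b.N; i++ {
		f, _ := os.Open("data.txt")
		f.Close()
	}
}
`
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo_test.go", src, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(countFeatures(file)) // map[loops:1 stdlib:os:1]
}
```

In a real pipeline, vectors of such counts would be fed to a binary classifier such as the Random Forest named in the abstract. For reference, the reported MCC is the Matthews correlation coefficient, computed from the confusion matrix by the standard formula

\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}

It ranges from -1 to 1, with 0 corresponding to chance-level prediction, so the reported 0.43 to 0.68 is well above chance.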

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Software
Uncontrolled Keywords: Software
Scope: Discipline-based scholarship (basic research)
Language: English
Date: 1 November 2021
Deposited On: 14 Oct 2022 12:25
Last Modified: 27 Mar 2025 02:36
Publisher: Springer
ISSN: 1382-3256
OA Status: Hybrid
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1007/s10664-021-09996-y
Other Identification Number: merlin-id:22828
Project Information:
  • Funder: Universität Zürich
Download PDF: 'Predicting unstable software benchmarks using static source code features'
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)

Statistics

Citations

10 citations in Web of Science®
17 citations in Scopus®

Downloads

21 downloads since deposited on 14 Oct 2022
5 downloads in the last 12 months