Branch Coverage Prediction in Automated Testing


Grano, Giovanni; Titov, Timofey V.; Panichella, Sebastiano; Gall, Harald C. (2019). Branch Coverage Prediction in Automated Testing. Journal of Software: Evolution and Process, 31(9):1-22.

Abstract

Software testing is crucial in continuous integration (CI). Ideally, at every commit, all the test cases should be executed and, moreover, new test cases should be generated for the new source code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. In this context, developers want to achieve a certain minimum level of coverage for every software build. However, executing all the test cases and, moreover, generating new ones for all the classes at every commit is not feasible. As a consequence, developers have to select which subset of classes has to be tested and/or targeted by test-case generation. We argue that knowing a priori the branch coverage that can be achieved with test-data generation tools can help developers make informed decisions about those issues. In this paper, we investigate the possibility of using source-code metrics to predict the coverage achieved by test-data generation tools.
We use four different categories of source-code features and assess the prediction on a large dataset involving more than 3,000 Java classes.
We compare different machine learning algorithms and conduct a fine-grained feature analysis aimed at investigating the factors that most impact the prediction accuracy.
Moreover, we extend our investigation to four different search budgets.
Our evaluation shows that, under nested cross-validation, the best model achieves an average MAE of 0.15 on EvoSuite and 0.21 on Randoop over the different budgets. Finally, the discussion of the results demonstrates the relevance of coupling-related features for the prediction accuracy.
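
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean absolute error (MAE) could be computed for coverage predictions. It is not the authors' evaluation code. The function name, the sample values, and the assumption that branch coverage is expressed as a fraction between 0 and 1 are all illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical branch-coverage values for four classes (assumed scale: 0 to 1).
observed = [0.90, 0.40, 0.75, 1.00]
predicted = [0.80, 0.55, 0.75, 0.70]

mae = mean_absolute_error(observed, predicted)
print(f"MAE = {mae:.4f}")
```

Under that assumed scale, an MAE of 0.15 would mean the predicted coverage is off by 15 percentage points on average.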

Statistics

Citations

2 citations in Web of Science®
5 citations in Scopus®

Downloads

20 downloads since deposited on 15 Mar 2019
15 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Software
Language: English
Date: 1 September 2019
Deposited On: 15 Mar 2019 09:57
Last Modified: 29 Jul 2020 10:23
Publisher: Wiley-Blackwell Publishing, Inc.
ISSN: 2047-7481
OA Status: Green
Publisher DOI: https://doi.org/10.1002/smr.2158
Related URLs: https://onlinelibrary.wiley.com/doi/epdf/10.1002/smr.2158 (Publisher)
Other Identification Number: merlin-id:17662

Download

Green Open Access

Content: Accepted Version
Language: English
Filetype: PDF
Size: 622kB