
Defect prediction as a multiobjective optimization problem


Canfora, Gerardo; De Lucia, Andrea; Di Penta, Massimiliano; Oliveto, Rocco; Panichella, Annibale; Panichella, Sebastiano (2015). Defect prediction as a multiobjective optimization problem. Software Testing, Verification and Reliability, 25(4):426-459.

Abstract

In this paper, we formalize the defect-prediction problem as a multiobjective optimization problem. Specifically, we propose an approach, coined multiobjective defect predictor (MODEP), based on multiobjective forms of machine-learning techniques (specifically, logistic regression and decision trees) trained using a genetic algorithm. The multiobjective approach allows software engineers to choose predictors that achieve a specific compromise between effectiveness (the number of likely defect-prone classes, or the number of defects, that the analysis would discover) and the lines of code to be analysed or tested (a proxy for the cost of code inspection). Results of an empirical evaluation on 10 datasets from the PROMISE repository indicate the quantitative superiority of MODEP over single-objective predictors, and over a trivial baseline that ranks classes by size in ascending or descending order. MODEP also outperforms an alternative approach for cross-project prediction based on local prediction upon clusters of similar classes.
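The trade-off the abstract describes can be made concrete with a small sketch (not the paper's implementation): given candidate predictors scored on two objectives, effectiveness (defects found, to maximize) and inspection cost in LOC (to minimize), a multiobjective approach like MODEP presents the non-dominated (Pareto-optimal) set so engineers can pick their preferred compromise. The predictor names and numbers below are hypothetical.

```python
def pareto_front(predictors):
    """Return the non-dominated predictors.

    Each predictor is a (name, effectiveness, cost_loc) tuple.
    Predictor a dominates b if a is at least as effective and no more
    costly than b, and strictly better on at least one objective.
    """
    def dominates(a, b):
        return (a[1] >= b[1] and a[2] <= b[2]
                and (a[1] > b[1] or a[2] < b[2]))

    return [p for p in predictors
            if not any(dominates(q, p) for q in predictors if q is not p)]


# Hypothetical candidates: (name, defects found, LOC to inspect)
candidates = [
    ("A", 40, 10_000),
    ("B", 55, 18_000),
    ("C", 50, 25_000),   # dominated by B: fewer defects, more LOC
    ("D", 70, 40_000),
]
print([name for name, _, _ in pareto_front(candidates)])  # prints ['A', 'B', 'D']
```

In MODEP, a genetic algorithm evolves the coefficients of the underlying models (logistic regression or decision trees) toward such a front, instead of optimizing a single accuracy measure.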


Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Date: 1 June 2015
Deposited On: 03 Jun 2015 14:56
Last Modified: 14 Feb 2018 09:11
Publisher: Wiley-Blackwell Publishing, Inc.
ISSN: 0960-0833
OA Status: Closed
Publisher DOI: https://doi.org/10.1002/stvr.1570
Other Identification Number: merlin-id:11978