When process data quality affects the number of bugs: correlations in software engineering datasets


Bernstein, Adrian; Bachmann, Abraham (2010). When process data quality affects the number of bugs: correlations in software engineering datasets. In: MSR '10: 7th IEEE Working Conference on Mining Software Repositories, Cape Town, South Africa, 2010, 62-71.

Abstract

Software engineering process information extracted from version control systems and bug tracking databases is widely used in empirical software engineering. In prior work, we showed that these data are plagued by quality deficiencies, which vary in their characteristics across projects. In addition, we showed that those deficiencies, in the form of bias, do impact the results of studies in empirical software engineering. While these findings affect software engineering researchers, the impact on practitioners has not yet been substantiated. In this paper, we therefore explore (i) whether process data quality and characteristics influence the bug fixing process and (ii) whether process quality, as measured by the process data, influences product (i.e., software) quality. Specifically, we analyze six Open Source and two Closed Source projects and show that process data quality and characteristics have an impact on the bug fixing process: the high rate of empty commit messages in Eclipse, for example, correlates with the bug report quality. We also show that product quality, measured by the number of bugs reported, is affected by process data quality measures. These findings have the potential to prompt practitioners to increase the quality of their software process and its associated data.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Event End Date: 2010
Deposited On: 24 Feb 2011 10:10
Last Modified: 12 Aug 2017 07:17
Other Identification Number: 1371
