ZORA (Zurich Open Repository and Archive)

Tracking concept drift of software projects using defect prediction quality

Ekanayake, J; Tappolet, J; Gall, H C; Bernstein, A (2009). Tracking concept drift of software projects using defect prediction quality. In: 6th IEEE Working Conference on Mining Software Repositories, Vancouver, Canada, May 2009.

Abstract

Defect prediction is an important task in the mining of software repositories, but the quality of predictions varies strongly within and across software projects. In this paper we investigate why prediction quality fluctuates so much, which we attribute to the changing nature of the bug (or defect) fixing process. To this end, we adopt the notion of concept drift, which denotes that a defect prediction model has become unsuitable because the set of influencing features has changed, usually due to a change in the underlying bug generation process (i.e., the concept). We explore four open source projects (Eclipse, OpenOffice, Netbeans and Mozilla) and construct file-level and project-level features for each of them from their respective CVS and Bugzilla repositories. We then use this data to build defect prediction models and visualize the prediction quality along the time axis. These visualizations allow us to identify concept drifts and, as a consequence, phases of stability and instability expressed in the level of defect prediction quality. Further, we identify the project features that influence the defect prediction quality, using both a tree-induction algorithm and a linear regression model. Our experiments show that software systems are subject to considerable concept drift over their evolution history. Specifically, we observe that the change in the number of authors editing a file and the number of defects fixed by them contribute to a project's concept drift and therefore influence the defect prediction quality. Our findings suggest that project managers who use defect prediction models for decision making should be aware of the current phase of stability or instability caused by a potential concept drift.
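As a rough illustration of how phases of stability and instability might be read off a prediction-quality time series, the sketch below labels each period by comparing its quality score against a fixed threshold and then groups consecutive periods with the same label. This is not the procedure used in the paper: the function names, the threshold of 0.7, and the sample scores are all assumptions chosen for illustration only.

```python
from itertools import groupby


def label_periods(quality, threshold=0.7):
    """Label a period 'stable' if its quality score meets the threshold.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    return ["stable" if q >= threshold else "unstable" for q in quality]


def group_phases(labels):
    """Collapse runs of equal labels into (label, start, end) tuples.

    start and end are zero-based period indices; end is inclusive.
    """
    phases = []
    index = 0
    for label, run in groupby(labels):
        length = len(list(run))
        phases.append((label, index, index + length - 1))
        index += length
    return phases


# Made-up per-period quality scores (for example, the AUC of a model
# trained on earlier periods and evaluated on the following one).
quality = [0.82, 0.80, 0.79, 0.61, 0.58, 0.77, 0.81]
phases = group_phases(label_periods(quality))
print(phases)
```

With these made-up scores, the drop in periods 3 and 4 appears as a single unstable phase between two stable ones. A fixed threshold is the simplest possible choice; a relative drop against a trailing average would be an equally plausible alternative.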

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Computer Science Applications; Physical Sciences > Software
Scope: Discipline-based scholarship (basic research)
Language: English
Event End Date: May 2009
Deposited On: 04 Feb 2010 11:25
Last Modified: 06 Mar 2024 13:57
OA Status: Green
Publisher DOI: https://doi.org/10.1109/MSR.2009.5069480
Other Identification Number: merlin-id:220

Statistics

Citations

43 citations in Web of Science®
61 citations in Scopus®

Downloads

250 downloads since deposited on 04 Feb 2010
34 downloads in the last 12 months