Abstract
In code reviews, developers examine code changes authored by peers and provide feedback through comments. Despite the importance of these comments, no accepted approach currently exists for assessing their quality. Therefore, this study has two main objectives: (1) to devise a conceptual model for the explainable evaluation of review comment quality, and (2) to develop models that automatically evaluate comments according to the conceptual model. To this end, we conduct mixed-method studies and propose a new approach: EvaCRC (Evaluating Code Review Comments). To achieve the first goal, we collect and synthesize quality attributes of review comments by triangulating data from authoritative documentation on code review standards and from the academic literature. We then validate these attributes against real-world instances. Finally, we establish mappings between quality attributes and grades by consulting domain experts, thus defining our final explainable conceptual model. To achieve the second goal, EvaCRC leverages multi-label learning. To evaluate and refine EvaCRC, we conduct an industrial case study with a global ICT enterprise. The results indicate that EvaCRC can effectively evaluate review comments while providing the reasons behind its grades.