Code ownership is a common practice in large, distributed software development teams. It is used to establish a chain of responsibility (who to blame if there is a problem) and simplify management (to whom a task or bug-fix should be assigned). A simple intuition for estimating code ownership is that the developer who has written majority code to a module should be an owner of that module. Moreover, prior research found that a module with weak code ownership (that is written by many minor authors) is more likely to have bugs in the future.
Nowadays, development practices are more than just writing code. A tool-based code review process has tightly integrated with the software development cycle. Recent research has found that in addition to a defect-hunting exercise, reviewers also help an author to improve the code changes. Then, these code writing and reviewing activities are orthogonal: teams can have a developer who reviews a lot but writes little and vice versa.
Does code review activity change what we know about ownership and software quality? This led us to investigate the importance of code review activities for code ownership and software quality. Through an empirical study of Qt and OpenStack systems, we (1) investigated the code authoring and reviewing activities of developers, (2) refined code ownership using code reviewing activities, and (3) studied the relationship between our refined ownership and software quality.
Code reviewers are the majority of contributors in a module
We found that the developers who did not previously write any code changes but only reviewed code changes are in the largest proportion of developers who contributed to a module (67%-86% of contributors in a module at the median). Moreover, 18%-50% of these review-only developers are documented core developers of the Qt and OpenStack projects. These findings suggest that if a code ownership estimation considers only code authoring activities, it is missing many developers who also provided reviewing contributions to a module.
Many minor authors are actually major reviewers
We observe the amount of code authoring and reviewing contributions that developers made to a module and classify two levels of expertise i.e., major and minor levels in each dimension (see the figure below).
Traditional code ownership (TCO) is solely derived from code authoring activities while review-specific ownership (RSO) is solely derived from code reviewing activities. The interesting part of this figure is the minor authors since these developers were classified as low-expertise developers.However, we found that 13%-58% of minor authors are major reviewers who actually reviewed many code changes to a module. This finding suggests that many major developers who actually make large contributions to modules by reviewing code changes were misclassified as low-expertise developers according to their low code authoring activities.
Reviewing expertise reverses the relationship between authoring expertise and software quality
We further investigated whether reviewing expertise has an impact on software quality or not. Hence, we compared the rates of developers with each level of expertise in between defective and clean modules. We found that the rates of developers with the minor author and minor reviewer expertise in defective modules are higher than those in clean modules (The left bean plot in the figure below). On the other hand, the rates of developers with the minor author but major reviewer expertise in defective modules are less than those in clean modules (The right bean plot in the figure below). When we control for several confounding factors using statistical models, the rates of developers with the minor author and minor reviewer expertise still share a strong increasing relationship with defect-proneness. These results indicate that the reviewing expertise share a relationship with software quality and it can reverse the direction of the association between the minor authorship and defect-proneness.
Results of our publication at ICSE2016 lead us to believe that code reviewing activity captures an important aspect of code ownership. Therefore, future estimations of code ownership should take code review activity into consideration in order to accurately model the contributions that developers have made to evolve software systems. Such code ownership estimations also can be used to chart quality improvement plans. For example, teams should apply additional scrutiny to module contributions from developers who have neither authored nor reviewed many code changes to that module in the past, while a module with many developers who have not authored many code changes should not be considered risky if those developers have reviewed many of the code changes to that module.