Redundancy, Context, and Preference: An Empirical Study of Duplicate Pull Requests in OSS Projects

Authors :
Yue Yu
Tao Wang
Minghui Zhou
Huaimin Wang
Gang Yin
Zhixing Li
Long Lan
Source :
IEEE Transactions on Software Engineering. 48:1309-1335
Publication Year :
2022
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2022.

Abstract

OSS projects are developed by globally distributed contributors, who today often collaborate through the pull-based model. While this model lowers the barrier to entry for OSS developers by synthesizing, automating and optimizing the contribution process, coordination among an increasing number of contributors remains a challenge due to the asynchronous and self-organized nature of distributed development. In particular, duplicate contributions, where multiple contributors unintentionally submit duplicate pull requests to achieve the same goal, are an elusive problem that may waste effort in automated testing, code review and software maintenance. While the issue of duplicate pull requests has been highlighted, the extent to which duplicate pull requests affect development in OSS communities has not been well investigated. In this paper, we conduct a mixed-approach study to bridge this gap. Based on a comprehensive dataset constructed from 26 popular GitHub projects, we obtain the following findings: (a) Duplicate pull requests consume redundant human and computing resources, exerting a significant impact on the contribution and evaluation process. (b) Contributors' inappropriate working patterns and the drawbacks of their collaborating environment might result in duplicate pull requests. (c) Compared to non-duplicate pull requests, duplicate pull requests have significantly different features, e.g., being submitted by inexperienced contributors, fixing bugs, touching cold files, and solving tracked issues. (d) Integrators choosing between duplicate pull requests prefer to accept those with early submission time, accurate and high-quality implementation, broad coverage, test code, high maturity, deep discussion, and active response. Finally, actionable suggestions and implications are proposed for OSS practitioners.

Details

ISSN :
2326-3881 and 0098-5589
Volume :
48
Database :
OpenAIRE
Journal :
IEEE Transactions on Software Engineering
Accession number :
edsair.doi...........851999c7c1db4d049e316a01c0b9918c
Full Text :
https://doi.org/10.1109/tse.2020.3018726