Redundancy, Context, and Preference: An Empirical Study of Duplicate Pull Requests in OSS Projects
- Source :
- IEEE Transactions on Software Engineering. 48:1309-1335
- Publication Year :
- 2022
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2022.
Abstract
- OSS projects are developed by globally distributed contributors, who today often collaborate through the pull-based model. While this model lowers the barrier to entry for OSS developers by synthesizing, automating and optimizing the contribution process, coordination among an increasing number of contributors remains a challenge due to the asynchronous and self-organized nature of distributed development. In particular, duplicate contributions, where multiple contributors unintentionally submit duplicate pull requests to achieve the same goal, are an elusive problem that may waste effort in automated testing, code review and software maintenance. While the issue of duplicate pull requests has been highlighted, the extent to which duplicate pull requests affect development in OSS communities has not been well investigated. In this paper, we conduct a mixed-approach study to bridge this gap. Based on a comprehensive dataset constructed from 26 popular GitHub projects, we obtain the following findings: (a) Duplicate pull requests lead to redundant use of human and computing resources, exerting a significant impact on the contribution and evaluation process. (b) Contributors' inappropriate working patterns and the drawbacks of their collaboration environment might result in duplicate pull requests. (c) Compared to non-duplicate pull requests, duplicate pull requests have significantly different features, e.g., being submitted by inexperienced contributors, fixing bugs, touching cold files, and solving tracked issues. (d) Integrators choosing between duplicate pull requests prefer to accept those with early submission time, accurate and high-quality implementation, broad coverage, test code, high maturity, deep discussion, and active response. Finally, actionable suggestions and implications are proposed for OSS practitioners.
- Subjects :
- Code review
Computer science
020207 software engineering
02 engineering and technology
Software maintenance
computer.software_genre
Data science
Empirical research
Test code
Asynchronous communication
0202 electrical engineering, electronic engineering, information engineering
Redundancy (engineering)
Distributed development
computer
Software
Barriers to entry
- ISSN :
- 2326-3881 and 0098-5589
- Volume :
- 48
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Software Engineering
- Accession number :
- edsair.doi...........851999c7c1db4d049e316a01c0b9918c
- Full Text :
- https://doi.org/10.1109/tse.2020.3018726