1. Better together: Elements of successful scientific software development in a distributed collaborative community
- Author
- Steven M. Lewis, Andrew M. Watkins, David Baker, Rocco Moretti, Tanja Kortemme, Ora Schueler-Furman, Jared Adolf-Bryfogle, William R. Schief, Jeffrey J. Gray, P. Douglas Renfrew, Vikram Khipple Mulligan, Jason W. Labonte, Sergey Lyskov, Christopher Bystroff, Brian D. Weitzner, Justyna Krys, Dominik Gront, Julia Koehler Leman, Philip Bradley, Brian Kuhlman, Jens Meiler, Roland L. Dunbrack, Andrew Leaver-Fay, Richard Bonneau, and Charlie E. M. Strauss
- Subjects
0301 basic medicine; Data Analysis; Models, Molecular; Science and Technology Workforce; Computer science; Review; Protein Structure Prediction; Careers in Research; Biochemistry; User-Computer Interface; 0302 clinical medicine; Software; Engineering; Electronics Engineering; Macromolecular Structure Analysis; Computer Engineering; Biology (General); Cooperative Behavior; Software suite; Ecology; Software Development; Software Engineering; Subject (documents); Research Personnel; Professions; Computational Theory and Mathematics; Modeling and Simulation; Engineering and Technology; Computer and Information Sciences; Protein Structure; Source lines of code; QH301-705.5; Science Policy; Maintainability; Computer Software; 03 medical and health sciences; Cellular and Molecular Neuroscience; Genetics; Humans; Social Behavior; Molecular Biology; Ecology, Evolution, Behavior and Systematics; Gene Library; business.industry; Software Tools; Research; Software development; Computational Biology; Biology and Life Sciences; Proteins; Data science; Subject-matter expert; 030104 developmental biology; Sustainability; People and Places; Scientists; Population Groupings; business; 030217 neurology & neurosurgery
- Abstract
Many scientific disciplines rely on computational methods for data analysis, model generation, and prediction. Implementing these methods is often accomplished by researchers with domain expertise but without formal training in software engineering or computer science. This arrangement has led to underappreciation of sustainability and maintainability of scientific software tools developed in academic environments. Some software tools have avoided this fate, including the scientific library Rosetta. We use this software and its community as a case study to show how modern software development can be accomplished successfully, irrespective of subject area. Rosetta is one of the largest software suites for macromolecular modeling, with 3.1 million lines of code and many state-of-the-art applications. Since the mid 1990s, the software has been developed collaboratively by the RosettaCommons, a community of academics from over 60 institutions worldwide with diverse backgrounds including chemistry, biology, physiology, physics, engineering, mathematics, and computer science. Developing this software suite has provided us with more than two decades of experience in how to effectively develop advanced scientific software in a global community with hundreds of contributors. Here we illustrate the functioning of this development community by addressing technical aspects (like version control, testing, and maintenance), community-building strategies, diversity efforts, software dissemination, and user support. We demonstrate how modern computational research can thrive in a distributed collaborative community. The practices described here are independent of subject area and can be readily adopted by other software development communities.
- Published
- 2020