Machicao, J., Ben Abbes, A., Meneguzzi, L., Corrêa, P. L. P., Specht, A., David, R., Subsol, G., Vellenich, D., Devillers, R., Stall, S., Mouquet, N., Chaumont, M., Berti‐Equille, L., and Mouillot, D.
The challenges of Reproducibility and Replicability (R & R) in computer science experiments have become a focus of attention in the last decade, as efforts to adhere to good research practices have increased. However, experiments using Deep Learning (DL) remain difficult to reproduce due to the complexity of the techniques used. Challenges such as estimating poverty indicators (e.g., wealth index levels) from remote sensing imagery, which require huge volumes of data across different geographic locations, would be impossible without DL technology. To test the reproducibility of DL experiments, we review three DL experiments that analyze visual indicators from satellite and street imagery. For each experiment, we identify the challenges found in the data sets, methods, and workflows used. As a result of this assessment, we propose a checklist incorporating relevant FAIR principles to screen an experiment for its reproducibility. Based on the lessons learned from this study, we recommend a set of actions aimed at improving the reproducibility of such experiments and reducing the likelihood of wasted effort. We believe that the target audience is broad, ranging from researchers seeking to reproduce an experiment to authors reporting an experiment and reviewers seeking to assess the work of others.
Plain Language Summary: This paper aims to help researchers understand the challenges of reproducing Deep Learning (DL) publications, mitigate reproducibility gaps, and make their own work more reproducible. We build on the work of others and add recommendations organized by (a) the quality of the data set (and associated metadata), (b) the DL methodology, and (c) the implementation methodology and the infrastructure used. To our knowledge, this is the first initiative of its kind to address the problem of reproducibility in remote sensing imagery and DL problems for real‐world tasks.
We hope this paper lowers the barrier to entry for the DL community to improve research. We follow the lifecycle mantra: reproduce, then replicate, with the goal of improving reproducibility.
Key Points: We discuss the reproducibility challenges faced by research that applies Deep Learning approaches to Big Data. We provide advice for pre‐screening papers (before experiments) to avoid poorly invested effort. We present a recipe with a set of mitigation strategies to address common errors that users (researchers, authors, reviewers) may encounter. [ABSTRACT FROM AUTHOR]