Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Several uncertainty estimation methods have recently been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 task on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence levels at incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses. Finally, in favor of transparency and reproducibility, our evaluation code is made publicly available at https://github.com/RagMeh11/QU-BraTS.
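The two properties of the score described in the abstract can be illustrated with a small sketch. It is not the official evaluation code (that is in the repository linked above) and it rests on assumptions that the abstract does not state: voxels are filtered out when their uncertainty exceeds a threshold, Dice is computed on the remaining voxels, the fractions of filtered true positives and true negatives are tracked, and the three curves are combined by averaging areas under them. All function names and the threshold values are hypothetical.

```python
def dice(pred, truth):
    """Dice overlap of two binary label lists; 1.0 when both are empty."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    denom = sum(pred) + sum(truth)
    return 1.0 if denom == 0 else 2.0 * tp / denom


def filtered_stats(pred, truth, unc, tau):
    """Keep voxels with uncertainty <= tau; return Dice, TP and TN counts on them."""
    kept = [(p, t) for p, t, u in zip(pred, truth, unc) if u <= tau]
    tp = sum(1 for p, t in kept if p and t)
    tn = sum(1 for p, t in kept if not p and not t)
    return dice([p for p, _ in kept], [t for _, t in kept]), tp, tn


def auc(xs, ys):
    """Trapezoidal area under (xs, ys), divided by the x-range so it lies in [0, 1]."""
    area = sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0
               for i in range(len(xs) - 1))
    return area / (xs[-1] - xs[0])


def uncertainty_score(pred, truth, unc, taus=(25, 50, 75, 100)):
    """Assumed combination: high filtered Dice is rewarded, and filtering out
    correct voxels (true positives or true negatives) is penalized."""
    tp_all = sum(1 for p, t in zip(pred, truth) if p and t)
    tn_all = sum(1 for p, t in zip(pred, truth) if not p and not t)
    dices, ftps, ftns = [], [], []
    for tau in taus:
        d, tp, tn = filtered_stats(pred, truth, unc, tau)
        dices.append(d)
        # Fraction of correct voxels removed because they were marked uncertain.
        ftps.append((tp_all - tp) / tp_all if tp_all else 0.0)
        ftns.append((tn_all - tn) / tn_all if tn_all else 0.0)
    xs = list(taus)
    return (auc(xs, dices) + (1.0 - auc(xs, ftps)) + (1.0 - auc(xs, ftns))) / 3.0


# Toy case: voxels 0-5 are predicted correctly, voxels 6 and 7 are errors.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 1, 0, 0, 0, 0, 1]
good_unc = [0, 0, 0, 0, 0, 0, 90, 90]      # uncertain only where wrong
bad_unc = [90, 90, 90, 90, 90, 90, 0, 0]   # uncertain only where right

good = uncertainty_score(pred, truth, good_unc)
bad = uncertainty_score(pred, truth, bad_unc)
print(good > bad)
```

In this toy case the uncertainty map that flags only the erroneous voxels scores higher than the one that flags only the correct voxels, which is the ranking behavior the abstract describes.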
Research reported in this publication was partly supported by the Informatics Technology for Cancer Research (ITCR) program of the National Cancer Institute (NCI) of the National Institutes of Health (NIH), under award numbers NIH/NCI/ITCR:U01CA242871 and NIH/NCI/ITCR:U24CA189523. It was also partly supported by the National Institute of Neurological Disorders and Stroke (NINDS) of the NIH, under award number NIH/NINDS:R01NS042645. The content of this publication is solely the responsibility of the authors and does not represent the official views of the NIH. This work was supported by a Canadian Natural Science and Engineering Research Council (NSERC) Collaborative Research and Development Grant (CRDPJ 505357 - 16), Synaptive Medical, and the Canada Institute for Advanced Research (CIFAR) Artificial Intelligence Chairs program.
Peer Reviewed
Document signed by 92 authors: Raghav Mehta1, Angelos Filos2, Ujjwal Baid3,4,5, Chiharu Sako3,4, Richard McKinley6, Michael Rebsamen6, Katrin Dätwyler6,53, Raphael Meier54, Piotr Radojewski6, Gowtham Krishnan Murugesan7, Sahil Nalawade7, Chandan Ganesh7, Ben Wagner7, Fang F. Yu7, Baowei Fei8, Ananth J. Madhuranthakam7,9, Joseph A. Maldjian7,9, Laura Daza10, Catalina Gómez10, Pablo Arbeláez10, Chengliang Dai11, Shuo Wang11, Hadrien Raynaud11, Yuanhan Mo11, Elsa Angelini12, Yike Guo11, Wenjia Bai11,13, Subhashis Banerjee14,15,16, Linmin Pei17, Murat AK17, Sarahi Rosas-González18, Illyess Zemmoura18,52, Clovis Tauber18, Minh H. Vu19, Tufve Nyholm19, Tommy Löfstedt20, Laura Mora Ballestar21, Veronica Vilaplana21, Hugh McHugh22,23, Gonzalo Maso Talou24, Alan Wang22,24, Jay Patel25,26, Ken Chang25,26, Katharina Hoebel25,26, Mishka Gidwani25, Nishanth Arun25, Sharut Gupta25, Mehak Aggarwal25, Praveer Singh25, Elizabeth R. Gerstner25, Jayashree Kalpathy-Cramer25, Nicolas Boutry27, Alexis Huard27, Lasitha Vidyaratne28, Md Monibor Rahman28, Khan M. Iftekharuddin28, Joseph Chazalon29, Elodie Puybareau29, Guillaume Tochon29, Jun Ma30, Mariano Cabezas31, Xavier Llado31, Arnau Oliver31, Liliana Valencia31, Sergi Valverde31, Mehdi Amian32, Mohammadreza Soltaninejad33, Andriy Myronenko34, Ali Hatamizadeh34, Xue Feng35, Quan Dou35, Nicholas Tustison36, Craig Meyer35,36, Nisarg A. Shah37, Sanjay Talbar38, Marc-André Weber39, Abhishek Mahajan48, Andras Jakab47, Roland Wiest6,46, Hassan M. Fathallah-Shaykh45, Arash Nazeri40, Mikhail Milchenko40,44, Daniel Marcus40,44, Aikaterini Kotrotsou43, Rivka Colen43, John Freymann41,42, Justin Kirby41,42, Christos Davatzikos3,4, Bjoern Menze49,50, Spyridon Bakas∗3,4,5, Yarin Gal∗2, Tal Arbel∗1,51
1Centre for Intelligent Machines (CIM), McGill University, Montreal, QC, Canada, 2Oxford Applied and Theoretical Machine Learning (OATML) Group, University of Oxford, Oxford, England, 3Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA, 4Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA, 5Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA, 6Support Center for Advanced Neuroimaging (SCAN), University Institute of Diagnostic and Interventional Neuroradiology, University of Bern, Inselspital, Bern University Hospital, Bern, Switzerland, 7Department of Radiology, University of Texas Southwestern Medical Center, Dallas, TX, USA, 8Department of Bioengineering, University of Texas at Dallas, Texas, USA, 9Advanced Imaging Research Center, University of Texas Southwestern Medical Center, Dallas, TX, USA, 10Universidad de los Andes, Bogotá, Colombia, 11Data Science Institute, Imperial College London, London, UK, 12NIHR Imperial BRC, ITMAT Data Science Group, Imperial College London, London, UK, 13Department of Brain Sciences, Imperial College London, London, UK, 14Machine Intelligence Unit,
Indian Statistical Institute, Kolkata, India, 15Department of CSE, University of Calcutta, Kolkata, India, 16Division of Visual Information and Interaction (Vi2), Department of Information Technology, Uppsala University, Uppsala, Sweden, 17Department of Diagnostic Radiology, The University of Pittsburgh Medical Center, Pittsburgh, PA, USA, 18UMR U1253 iBrain, Université de Tours, Inserm, Tours, France, 19Department of Radiation Sciences, Umeå University, Umeå, Sweden, 20Department of Computing Science, Umeå University, Umeå, Sweden, 21Signal Theory and Communications Department, Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain, 22Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand, 23Radiology Department, Auckland City Hospital, Auckland, New Zealand, 24Auckland Bioengineering Institute, University of Auckland, New Zealand, 25Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, USA, 26Massachusetts Institute of Technology, Cambridge, MA, USA, 27EPITA Research and Development Laboratory (LRDE), France, 28Vision Lab, Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529, USA, 29EPITA Research and Development Laboratory (LRDE), Le Kremlin-Bicêtre, France, 30School of Science, Nanjing University of Science and Technology, 31Research Institute of Computer Vision and Robotics, University of Girona, Spain, 32Department of Electrical and Computer Engineering, University of Tehran, Iran, 33School of Computer Science, University of Nottingham, UK, 34NVIDIA, Santa Clara, CA, US, 35Biomedical Engineering, University of Virginia, Charlottesville, USA, 36Radiology and Medical Imaging, University of Virginia, Charlottesville, USA, 37Department of Electrical Engineering, Indian Institute of Technology - Jodhpur, Jodhpur, India, 38SGGS Institute of Engineering and Technology, Nanded, India, 39Institute of Diagnostic and Interventional Radiology, Pediatric Radiology and Neuroradiology, University Medical Center, 40Department of Radiology, Washington University, St. Louis, MO, USA, 41Leidos Biomedical Research, Inc, Frederick National Laboratory for Cancer Research, Frederick, MD, USA, 42Cancer Imaging Program, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA, 43Department of Diagnostic Radiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA, 44Neuroimaging Informatics and Analysis Center, Washington University, St. Louis, MO, USA, 45Department of Neurology, The University of Alabama at Birmingham, Birmingham, AL, USA, 46Institute for Surgical Technology and Biomechanics, University of Bern, Bern, Switzerland, 47Center for MR-Research, University Children’s Hospital Zurich, Zurich, Switzerland, 48Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India, 49Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland, 50Department of Informatics, Technical University of Munich, Munich, Germany, 51MILA - Quebec Artificial Intelligence Institute, Montreal, QC, Canada, 52Neurosurgery department, CHRU de Tours, Tours, France, 53Human Performance Lab, Schulthess Clinic, Zurich, Switzerland, 54armasuisse S+T, Thun, Switzerland.
©2021 Mehta et al. License: CC-BY 4.0. arXiv:2112.10074v1 [eess.IV], 19 Dec 2021.