How precise are performance estimates for typical medical image segmentation tasks?

Authors :
Jurdi, Rosana
Colliot, Olivier
Algorithms, models and methods for images and signals of the human brain (ARAMIS)
Sorbonne Université (SU)-Inria de Paris
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut du Cerveau = Paris Brain Institute (ICM)
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP]
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP]
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)
IEEE
ANR-19-P3IA-0001,PRAIRIE,PaRis Artificial Intelligence Research InstitutE(2019)
ANR-10-IAHU-0006,IHU-A-ICM,Institut de Neurosciences Translationnelles de Paris(2010)
Source :
IEEE International Symposium on Biomedical Imaging (ISBI 2023), IEEE, Apr 2023, Cartagena de Indias, Colombia
Publication Year :
2023
Publisher :
HAL CCSD, 2023.

Abstract

An important issue in medical image processing is to estimate not only the performance of algorithms but also the precision of that performance estimate. Reporting precision typically amounts to reporting the standard error of the mean (SEM) or, equivalently, confidence intervals. However, this is rarely done in medical image segmentation studies. In this paper, we aim to estimate the typical confidence that can be expected in such studies. To that end, we first perform experiments on Dice metric estimation using a standard deep learning model (U-Net) and a classical task from the Medical Segmentation Decathlon. We extensively study precision estimation using both a Gaussian assumption and bootstrapping (which does not require any assumption on the distribution). We then perform simulations for other test set sizes and performance spreads. Overall, our work shows that small test sets lead to wide confidence intervals (e.g. ∼8 points of Dice for 20 samples with σ ≃ 10).
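
The two precision estimates named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 1.96 normal quantile, the percentile form of the bootstrap, the number of resamples, and the synthetic Dice scores are all assumptions made here for illustration. The last function gives the expected 95% interval width under the Gaussian assumption, 2 × 1.96 × σ / √n, which for σ = 10 and n = 20 is roughly in line with the order of magnitude quoted in the abstract.

```python
import math
import random
import statistics


def gaussian_ci(scores, z=1.96):
    """Mean, SEM and normal-approximation confidence interval.

    Assumes the sample mean is approximately Gaussian; z=1.96 gives a 95% interval.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(n)  # sample std / sqrt(n)
    return mean, sem, (mean - z * sem, mean + z * sem)


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval of the mean.

    Resamples the test set with replacement; no distributional assumption.
    """
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def expected_ci_width(sigma, n, z=1.96):
    """Expected full width of the Gaussian interval for spread sigma and n samples."""
    return 2 * z * sigma / math.sqrt(n)


# Synthetic per-case Dice scores (in points, 0-100); NOT data from the paper.
rng = random.Random(42)
dice_scores = [min(100.0, max(0.0, rng.gauss(80.0, 10.0))) for _ in range(20)]

mean, sem, (g_low, g_high) = gaussian_ci(dice_scores)
b_low, b_high = bootstrap_ci(dice_scores)

print(f"mean Dice = {mean:.2f}, SEM = {sem:.2f}")
print(f"Gaussian 95% CI  = [{g_low:.2f}, {g_high:.2f}]")
print(f"bootstrap 95% CI = [{b_low:.2f}, {b_high:.2f}]")
print(f"expected width (sigma=10, n=20) = {expected_ci_width(10, 20):.2f}")
```

Because the expected width scales as 1/√n, quadrupling the test set size only halves the interval, which is why small test sets give such imprecise performance estimates.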

Details

Language :
English
Database :
OpenAIRE
Journal :
IEEE International Symposium on Biomedical Imaging (ISBI 2023), IEEE, Apr 2023, Cartagena de Indias, Colombia
Accession number :
edsair.od.......165..f6c0194935646a23eef0a1ae6d05d984