Savas Ozkan, N. Sinem Gezer, Dmitrii Lachinov, Debdoot Sheet, Fabian Isensee, Gozde Bozdagi Akar, M. Alper Selver, Soumick Chatterjee, Oliver Speck, A. Emre Kavur, Sinem Aslan, Josef Pauli, Oğuz Dicle, Gozde Unal, Pierre-Henri Conze, Andreas Nürnberger, Klaus H. Maier-Hein, Gurbandurdy Dovletov, Ronnie Rajan, Vladimir Groza, Rachana Sathish, Bora Baydar, Matthias Perkonigg, Shuo Han, Philipp Ernst, Duc Duy Pham, Mustafa Baris, Dokuz Eylül Üniversitesi = Dokuz Eylül University [Izmir] (DEÜ), University of Ca’ Foscari [Venice, Italy], Département Image et Traitement Information (IMT Atlantique - ITI), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Laboratoire de Traitement de l'Information Medicale (LaTIM), Université de Brest (UBO)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre Hospitalier Régional Universitaire de Brest (CHRU Brest)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut Brestois Santé Agro Matière (IBSAM), Université de Brest (UBO), MEDIAN Technologies, University of Duisburg-Essen, Otto-von-Guericke University [Magdeburg] (OVGU), Middle East Technical University [Ankara] (METU), Medizinische Universität Wien = Medical University of Vienna, Johns Hopkins University (JHU), German Cancer Research Center - Deutsches Krebsforschungszentrum [Heidelberg] (DKFZ), Department of Biomedical Imaging and Image-guided Therapy [Medical University of Vienna], Indian Institute of Technology Kharagpur (IIT Kharagpur), and Istanbul Technical University (ITÜ)
Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. Although these systems outperform earlier ones in overall accuracy, the effects of DL model properties and parameters on performance are hard to interpret. This makes comparative analysis a necessary tool towards interpretable studies and systems. Moreover, the performance of DL for emerging learning approaches such as cross-modality and multi-modal semantic segmentation tasks has rarely been discussed. To expand the knowledge on these topics, the Combined (CT-MR) Healthy Abdominal Organ Segmentation (CHAOS) challenge was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2019, in Venice, Italy. Abdominal organ segmentation from routine acquisitions plays an important role in several clinical applications, such as pre-surgical planning or morphological and volumetric follow-ups for various diseases. These applications require a certain level of performance on a diverse set of metrics, such as maximum symmetric surface distance (MSSD) to determine the surgical error margin, or overlap errors for tracking size and shape differences. Previous abdomen-related challenges have mainly focused on tumor/lesion detection and/or classification with a single modality. Conversely, CHAOS provides both abdominal CT and MR data from healthy subjects for single and multiple abdominal organ segmentation. Five different but complementary tasks were designed to analyze the capabilities of participating approaches from multiple perspectives. The results were investigated thoroughly and compared with manual annotations and interactive methods.
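To make the two kinds of metric mentioned above concrete, the following is a minimal sketch of an overlap score (Dice) and a maximum symmetric surface distance on binary masks. It is not the challenge's official evaluation code: the function names `dice`, `surface_voxels`, and `mssd` are hypothetical, the surface is assumed to be the set of foreground voxels with at least one background face neighbor, and distances are assumed to be measured between voxel centers scaled by the voxel spacing.

```python
import numpy as np


def dice(a, b):
    """Dice overlap between two binary masks (1.0 means identical)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return float(2.0 * np.logical_and(a, b).sum() / total)


def surface_voxels(mask):
    """Indices of foreground voxels that touch background along any axis."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if this face neighbor is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    return np.argwhere(mask & ~interior)


def mssd(a, b, spacing=None):
    """Maximum symmetric surface distance, in the units of `spacing`."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("masks must have the same shape")
    scale = np.ones(a.ndim) if spacing is None else np.asarray(spacing, dtype=float)
    pa = surface_voxels(a) * scale
    pb = surface_voxels(b) * scale
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    # Brute-force pairwise distances: fine for small masks, memory-hungry for large ones.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Toy example: a 4x4x4 cube and the same cube shifted by one voxel.
reference = np.zeros((10, 10, 10), dtype=bool)
reference[2:6, 2:6, 2:6] = True
prediction = np.zeros((10, 10, 10), dtype=bool)
prediction[3:7, 2:6, 2:6] = True

print(dice(reference, prediction))  # 0.75
print(mssd(reference, prediction))  # 1.0
```

The toy example illustrates why both metrics are reported: Dice summarizes volumetric agreement, while the surface distance is driven by the single worst boundary error. For full-size CT or MR volumes, a distance-transform-based implementation would likely be preferable to the pairwise distance matrix used here.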
The analysis shows that DL models for a single modality (CT / MR) can deliver reliable volumetric analysis performance (DICE: 0.98 ± 0.00 / 0.95 ± 0.01), but the best MSSD performance remains limited (21.89 ± 13.94 / 20.85 ± 10.63 mm). The performance of participating models decreases dramatically for cross-modality tasks for the liver (DICE: 0.88 ± 0.15, MSSD: 36.33 ± 21.97 mm). Despite contrary examples in other applications, multi-tasking DL models designed to segment all organs are observed to perform worse than organ-specific ones (performance drop of around 5%). Nevertheless, some of the successful models show better performance with their multi-organ versions. We conclude that the exploration of those pros and cons in both single vs multi-organ and cross-modality segmentations is poised to have an impact on further research for developing effective algorithms that would support real-world clinical applications. Finally, with more than 1500 participants and more than 550 submissions, another important contribution of this study is the analysis of shortcomings of challenge organizations, such as the effects of multiple submissions and the peeking phenomenon. © 2020 Elsevier B.V.
The organizers would like to thank Ivana Isgum and Tom Vercauteren in the challenge committee of ISBI 2019 for their guidance and support. We express our gratitude to supporting organizations of the grand-challenge.org platform. We thank Esranur Kazaz, Umut Baran Ekinci, Ece Köse, Fabian Isensee, David Völgyes, and Javier Coronel for their contributions. Last but not least, our special thanks go to Ludmila I. Kuncheva for her valuable contributions.
This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) ARDEB-EEEAG under grant number 116E133 and the TUBITAK BIDEB-2214 International Doctoral Research Fellowship Programme. The work of P. Ernst, S. Chatterjee, O. Speck, and A. Nürnberger was conducted within the context of the International Graduate School MEMoRIAL at OvGU Magdeburg, Germany, supported by ESF (project no. ZS/2016/08/80646). The work of S. Aslan within the context of Ca’ Foscari University of Venice is supported under TUBITAK BIDEB-2219 grant no. 1059B191701102.