Comparative validation of multi-instance instrument segmentation in endoscopy: Results of the ROBUST-MIS 2019 challenge

Authors :
Pablo Arbeláez
Annika Reinke
Sabrina Kletz
Yueming Jin
Zhen-Liang Ni
Fabian Isensee
Cristina González
Sebastian Bodenstedt
Pål Halvorsen
Lena Maier-Hein
Ruohua Shi
Beat P. Müller-Stich
Hannes Kenngott
Martin Wagner
Yan-Jie Zhou
Kadir Kirtac
Manuel Wiesenfarth
Stefanie Speidel
Stefan Leger
Zhixuan Li
Thuy Nuong Tran
Tingting Jiang
Peter M. Full
Klaus H. Maier-Hein
Patrick Scholz
Laura Bravo-Sánchez
Hua-Bin Chen
Yujie Zhang
Lei Zhu
Annette Kopp-Schneider
Zeng-Guang Hou
Diana Mindroc-Filimon
Liansheng Wang
Gutai Wang
Enes Hosgor
Hellena Hempe
Tobias Roß
Jon Lindström Bolmgren
Pierangela Bruno
Martin Apitz
Michael Riegler
Gui-Bin Bian
Lu Wang
Pheng-Ann Heng
Michael Stenzel
Klaus Schoeffmann
Debesh Jha
Dong Guo
Jiacheng Wang
Isabell Twick
Publication Year :
2020
Publisher :
Elsevier, 2020.

Abstract

Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer- and robotic-assisted interventions. While numerous methods for the detection, segmentation and tracking of medical instruments based on endoscopic video images have been proposed in the literature, key limitations remain to be addressed. The first is robustness, that is, the reliable performance of state-of-the-art methods when run on challenging images (e.g. in the presence of blood, smoke or motion artifacts). The second is generalization: algorithms trained for a specific intervention in a specific hospital should generalize to other interventions or institutions. In an effort to promote solutions for these limitations, we organized the Robust Medical Instrument Segmentation (ROBUST-MIS) challenge as an international benchmarking competition with a specific focus on the robustness and generalization capabilities of algorithms. For the first time in the field of endoscopic image processing, our challenge included a task on binary segmentation and also addressed multi-instance detection and segmentation. The challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures from three different types of surgery. The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data. The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap. While the average detection and segmentation quality of the best-performing algorithms is high, future research should concentrate on the detection and segmentation of small, crossing, moving and transparent instruments and instrument parts.

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....015a53cfd5f5fa1877b58b11efc5bae6