Back to Search Start Over

Pixel-Wise Recognition for Holistic Surgical Scene Understanding

Authors :
Ayobi, Nicolás
Rodríguez, Santiago
Pérez, Alejandra
Hernández, Isabela
Aparicio, Nicolás
Dessevres, Eugénie
Peña, Sebastián
Santander, Jessica
Caicedo, Juan Ignacio
Fernández, Nicolás
Arbeláez, Pablo
Publication Year :
2024

Abstract

This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset, a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity. Our approach enables a multi-level comprehension of surgical activities, encompassing long-term tasks such as surgical phases and steps recognition and short-term tasks including surgical instrument segmentation and atomic visual actions detection. To exploit our proposed benchmark, we introduce the Transformers for Actions, Phases, Steps, and Instrument Segmentation (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals from an instrument segmentation model to tackle the multi-granularity of our benchmark. Through extensive experimentation, we demonstrate the impact of including segmentation annotations in short-term recognition tasks, highlight the varying granularity requirements of each task, and establish TAPIS's superiority over previously proposed baselines and conventional CNN-based models. Additionally, we validate the robustness of our method across multiple public benchmarks, confirming the reliability and applicability of our dataset. This work represents a significant step forward in Endoscopic Vision, offering a novel and comprehensive framework for future research towards a holistic understanding of surgical procedures.<br />Comment: Preprint submitted to Medical Image Analysis. Official extension of previous MICCAI 2022 (https://link.springer.com/chapter/10.1007/978-3-031-16449-1_42) and ISBI 2023 (https://ieeexplore.ieee.org/document/10230819) orals. Data and codes are available at https://github.com/BCV-Uniandes/GraSP

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2401.11174
Document Type :
Working Paper