MATIS: Masked-Attention Transformers for Surgical Instrument Segmentation

Authors :: Ayobi, Nicolás; Pérez-Rondón, Alejandra; Rodríguez, Santiago; Arbeláez, Pablo
Source :: 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), 10230819
Publication Year :: 2023
Abstract :: We propose Masked-Attention Transformers for Surgical Instrument Segmentation (MATIS), a two-stage, fully transformer-based method that leverages modern pixel-wise attention mechanisms for instrument segmentation. MATIS exploits the instance-level nature of the task by employing a masked attention module that generates and classifies a set of fine instrument region proposals. Our method incorporates long-term video-level information through video transformers to improve temporal consistency and enhance mask classification. We validate our approach on the two standard public benchmarks, Endovis 2017 and Endovis 2018. Our experiments demonstrate that MATIS' per-frame baseline outperforms previous state-of-the-art methods and that including our temporal consistency module boosts our model's performance further.
Comment :: ISBI 2023 (Oral). Winning method of the 2022 SAR-RARP50 Challenge (arXiv:2401.00496). Official extension published at arXiv:2401.11174. Code available at https://github.com/BCV-Uniandes/MATIS

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2303.09514
Document Type :: Working Paper
Full Text :: https://doi.org/10.1109/ISBI53787.2023.10230819