Trans-SVNet: hybrid embedding aggregation Transformer for surgical workflow analysis

Authors :
Jin, Yueming
Long, Yonghao
Gao, Xiaojie
Stoyanov, Danail
Dou, Qi
Heng, Pheng-Ann
Source :
International Journal of Computer Assisted Radiology and Surgery; December 2022, Vol. 17 Issue: 12 p2193-2202, 10p
Publication Year :
2022

Abstract

Purpose: Real-time surgical workflow analysis is a key component of computer-assisted intervention systems for improving cognitive assistance. Most existing methods rely solely on conventional temporal models and encode features in a successive spatial–temporal arrangement, so the supportive benefits of intermediate features are partially lost from both visual and temporal aspects. In this paper, we rethink feature encoding to attend to and preserve the critical information for accurate workflow recognition and anticipation. Methods: We introduce the Transformer to surgical workflow analysis to reconsider the complementary effects of spatial and temporal representations. We propose a hybrid embedding aggregation Transformer, named Trans-SVNet, which lets the designed spatial and temporal embeddings interact effectively by using the spatial embedding to query the temporal embedding sequence. The model is jointly optimized with loss objectives from both analysis tasks to leverage their high correlation. Results: We extensively evaluate our method on three large surgical video datasets. It consistently outperforms state-of-the-art methods on the workflow recognition task across all three datasets. Learning recognition jointly with anticipation yields a large improvement in recognition results. Our approach is also effective on the anticipation task, achieving promising performance. The model runs in real time, at an inference speed of 0.0134 seconds per frame. Conclusion: Experimental results demonstrate the efficacy of our hybrid embedding integration, which rediscovers crucial cues from complementary spatial–temporal embeddings. The better performance obtained with multi-task learning indicates that the anticipation task brings additional knowledge to the recognition task. The effectiveness and efficiency of our method also show its promising potential for use in the operating room.
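
The idea of using a spatial embedding to query a temporal embedding sequence can be illustrated with plain scaled dot-product attention. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a single attention head, omits the learned projections and further layers a Transformer would have, and the names `query_temporal_sequence`, `spatial_query`, and `temporal_sequence` are hypothetical. The last line only converts the reported per-frame latency into an approximate frame rate.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_temporal_sequence(spatial_query, temporal_sequence):
    """Attend over temporal embeddings with one spatial embedding as the query.

    spatial_query: list of d floats (assumed embedding of the current frame).
    temporal_sequence: list of T lists of d floats (assumed embeddings of the
    most recent T frames). Keys and values are the same vectors here; a real
    model would apply learned projections first.
    """
    d = len(spatial_query)
    # One similarity score per time step, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(spatial_query, key)) / math.sqrt(d)
        for key in temporal_sequence
    ]
    weights = softmax(scores)
    # Weighted sum of the temporal embeddings, dimension by dimension.
    fused = [
        sum(w * value[i] for w, value in zip(weights, temporal_sequence))
        for i in range(d)
    ]
    return fused, weights


# Toy inputs (illustrative only): 3 past frames, 2-dimensional embeddings.
temporal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spatial = [2.0, 0.0]
fused, weights = query_temporal_sequence(spatial, temporal)

# Frame rate implied by the reported 0.0134 seconds per frame.
frames_per_second = 1.0 / 0.0134
```

In this toy case the first and third time steps align equally well with the query and receive equal weight, while the second receives less. The reported latency corresponds to roughly 75 frames per second.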

Details

Language :
English
ISSN :
1861-6410 and 1861-6429
Volume :
17
Issue :
12
Database :
Supplemental Index
Journal :
International Journal of Computer Assisted Radiology and Surgery
Publication Type :
Periodical
Accession number :
ejs60857675
Full Text :
https://doi.org/10.1007/s11548-022-02743-8