Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks
- Source :
- Medical Image Analysis, Elsevier, 2018, 47, pp.203-218. ⟨10.1016/j.media.2018.05.001⟩
- Publication Year :
- 2018
- Publisher :
- HAL CCSD, 2018.
Abstract
- This paper investigates the automatic monitoring of tool usage during surgery, with potential applications in report generation, surgical training and real-time decision support. Two surgeries are considered: cataract surgery, the most common surgical procedure, and cholecystectomy, one of the most common digestive surgeries. Tool usage is monitored in videos recorded either through a microscope (cataract surgery) or an endoscope (cholecystectomy). Following state-of-the-art video analysis solutions, each frame of the video is analyzed by convolutional neural networks (CNNs) whose outputs are fed to recurrent neural networks (RNNs) in order to take temporal relationships between events into account. The novelty lies in the way those CNNs and RNNs are trained. Computational complexity prevents the end-to-end training of "CNN+RNN" systems. Therefore, CNNs are usually trained first, independently of the RNNs. This approach is clearly suboptimal for surgical tool analysis: many tools are very similar to one another, but they can generally be differentiated based on past events. CNNs should be trained to extract the most useful visual features in combination with the temporal context. A novel boosting strategy is proposed to achieve this goal: the CNN and RNN parts of the system are simultaneously enriched by progressively adding weak classifiers (either CNNs or RNNs) trained to improve the overall classification accuracy. Experiments were performed on a dataset of 50 cataract surgery videos and a dataset of 80 cholecystectomy videos. Very good classification performance is achieved on both datasets: tool usage could be labeled with an average area under the ROC curve of $A_z = 0.9961$ and $A_z = 0.9939$, respectively, in offline mode (using past, present and future information), and $A_z = 0.9957$ and $A_z = 0.9936$, respectively, in online mode (using past and present information only). Accepted for publication in Medical Image Analysis.
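The abstract reports performance as an average area under the ROC curve ($A_z$). As an illustration only, the sketch below shows one common way such a figure can be computed: a per-tool AUC obtained from the Mann-Whitney statistic on per-frame scores, then averaged over tools. Averaging over tools, the function names `roc_auc` and `mean_auc`, the tool names and the toy scores are all assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Over all (positive frame, negative frame) pairs, counts how often the
    positive frame receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(labels_by_tool: Dict[str, List[int]],
             scores_by_tool: Dict[str, List[float]]) -> float:
    """Average the per-tool AUCs (assumed aggregation, see lead-in)."""
    aucs = [roc_auc(labels_by_tool[t], scores_by_tool[t]) for t in labels_by_tool]
    return sum(aucs) / len(aucs)


# Hypothetical toy data: 4 frames, 2 tools, label 1 = tool in use in that frame.
labels = {"tool_a": [0, 0, 1, 1], "tool_b": [0, 1, 0, 1]}
scores = {"tool_a": [0.1, 0.4, 0.35, 0.8], "tool_b": [0.2, 0.9, 0.1, 0.7]}

a_z = mean_auc(labels, scores)
print(f"A_z = {a_z:.4f}")
```

The pairwise loop is quadratic in the number of frames and is meant for readability; on full-length videos a rank-based implementation would be the more practical choice.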
- Subjects :
- FOS: Computer and information sciences
Decision support system
medicine.medical_specialty
Boosting (machine learning)
Computational complexity theory
Computer science
Computer Vision and Pattern Recognition (cs.CV)
Video Recording
Computer Science - Computer Vision and Pattern Recognition
Health Informatics
02 engineering and technology
Cataract Extraction
Convolutional neural network
030218 nuclear medicine & medical imaging
03 medical and health sciences
0302 clinical medicine
0202 electrical engineering, electronic engineering, information engineering
medicine
Image Processing, Computer-Assisted
Humans
Radiology, Nuclear Medicine and imaging
Cholecystectomy
[SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing
Radiological and Ultrasound Technology
Frame (networking)
Novelty
Computer Graphics and Computer-Aided Design
Surgery
Recurrent neural network
020201 artificial intelligence & image processing
Computer Vision and Pattern Recognition
Neural Networks, Computer
Monitoring tool
Algorithms
Details
- Language :
- English
- ISSN :
- 1361-8415 and 1361-8423
- Database :
- OpenAIRE
- Journal :
- Medical Image Analysis, Elsevier, 2018, 47, pp.203-218. ⟨10.1016/j.media.2018.05.001⟩
- Accession number :
- edsair.doi.dedup.....790b8d36a2dbb5e4b8f12ebb415f2d6a