1. Ensemble Learning for Three-dimensional Medical Image Segmentation of Organ at Risk in Brachytherapy Using Double U-Net, Bi-directional ConvLSTM U-Net, and Transformer Network
- Author
- Soniya Pal, Raj Pal Singh, and Anuj Kumar
- Subjects
brachytherapy, image segmentation, lstm u-net, medical image segmentation, organ at risk, transformer, u-net, Medical physics. Medical radiology. Nuclear medicine, R895-920
- Abstract
Aim: This article presents a novel approach to automating the segmentation of organs at risk (OARs) for high-dose-rate brachytherapy patients using three deep learning models combined with ensemble learning techniques. It aims to improve the accuracy and efficiency of segmentation. Materials and Methods: The dataset comprised computed tomography (CT) scans of 60 patients obtained from our own institutional image bank and 10 patients from another institute, all in Digital Imaging and Communications in Medicine format. Experienced radiation oncologists manually segmented four OARs for each scan. Each scan was preprocessed, and three models, Double U-Net (DUN), Bi-directional ConvLSTM U-Net (BCUN), and Transformer Network (TN), were trained on reduced CT scans (240 × 240 × 128) due to memory limitations. Ensemble learning techniques were employed to enhance accuracy and segmentation metrics. Testing and validation were conducted on 12 patients from our institute (OID) and 10 patients from another institute (DID). Results: For the DID test dataset, the ensemble combining TN and BCUN (TN + BCUN) achieved an average Dice similarity coefficient (DSC) ranging from 0.992 to 0.998, and the ensemble combining DUN and BCUN (DUN + BCUN) achieved an average DSC ranging from 0.990 to 0.993, reflecting high segmentation accuracy. The 95% Hausdorff distance (HD) ranged from 0.9 to 1.2 mm for TN + BCUN and from 1.1 to 1.4 mm for DUN + BCUN, demonstrating precise segmentation boundaries. Conclusion: The proposed method leverages the strengths of each network architecture. The DUN setup excels in sequential processing, the BCUN captures spatiotemporal dependencies, and the TN provides a robust understanding of global context. This combination enables efficient and accurate segmentation, surpassing human expert performance in both time and accuracy.
- Published
- 2024
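The abstract reports Dice similarity coefficients for ensembles of two models but does not say how the model outputs were combined. As a rough illustration only, the sketch below assumes the simplest scheme: per-voxel foreground probabilities from each model are averaged and then thresholded, and the result is scored with the standard DSC, 2|A ∩ B| / (|A| + |B|). The function names, the 0.5 threshold, the flattened-list mask representation, and the toy values are all hypothetical and are not taken from the article.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Masks are given as flat sequences of 0/1 voxel labels (an assumption
    made to keep this sketch free of third-party dependencies).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def average_ensemble(prob_maps, threshold=0.5):
    """Average per-voxel probabilities from several models, then threshold.

    This is one common ensembling scheme; the article may use a different one.
    """
    if not prob_maps:
        raise ValueError("at least one probability map is required")
    n_models = len(prob_maps)
    averaged = [sum(voxel) / n_models for voxel in zip(*prob_maps)]
    return [1 if p >= threshold else 0 for p in averaged]


# Toy example with six voxels; all values are made up for illustration.
truth = [0, 1, 1, 1, 0, 0]
model_a = [0.1, 0.9, 0.8, 0.4, 0.2, 0.1]  # misses the fourth voxel alone
model_b = [0.2, 0.7, 0.9, 0.7, 0.6, 0.1]  # over-segments the fifth voxel alone

ensemble_mask = average_ensemble([model_a, model_b])
print(dice_coefficient(ensemble_mask, truth))
```

In this toy case each model on its own makes one error, while the averaged prediction matches the reference mask. Real pipelines would typically operate on 3-D arrays per OAR and also report a boundary metric such as the 95% Hausdorff distance.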