
98 An open-source foundation for head and neck radiomics.

Authors :
Scott, Katy L.
Kim, Sejin
Joseph, Jermiah J.
Boccalon, Matthew
Welch, Mattea
Yousafzai, Umar
Smith, Ian
McIntosh, Chris
Rey-McIntyre, Katrina
Huang, Shao Hui
Patel, Tirth
Tadic, Tony
O'Sullivan, Brian
Bratman, Scott V.
Hope, Andrew J.
Haibe-Kains, Benjamin
Source :
Radiotherapy & Oncology. Mar2024:Supplement 1, Vol. 192, pS22-S25. 4p.
Publication Year :
2024

Abstract

With the purported future of oncological care being precision medicine, the hunt for predictive biomarkers has become a focal point. A potential source lies in radiological imaging, which has motivated the field of radiomics for the last decade [1–4]. Radiomics research, however, has been hampered by inconsistent methodology, despite efforts to establish standard features [5]. The release of the open-source PyRadiomics toolkit [6] was a significant and necessary step to standardize radiomics analysis, but the collation and distribution of publicly available radiomics datasets remain poorly organized within the community. As a result, significant overhead remains when dealing with multiple training, testing, and validation datasets from both internal and external sources. Further, a recent study has raised the question of whether radiomic features with high predictive value are surrogates for tumour volume measurements [7]. There is a need for a standard methodology for radiomic feature extraction, as well as large, publicly available radiomic datasets that have undergone rigorous processing to benchmark analyses. In this study, we have developed a reproducible, automated, open-source processing pipeline to generate analysis-ready radiomics data. We showcase the pipeline's capabilities by processing and analyzing the largest publicly available head and neck cancer (HNC) dataset, RADCURE [8], and compare three previously published radiomics models [1,7,9] using the resulting data. Data outputs have been made available via https://www.orcestra.ca/, a web app that hosts processed 'omics data. Our proposed pipeline leverages three main tools: Med-ImageTools [10], PyRadiomics [6], and ORCESTRA [11]. While the former two are imaging-specific, we have modified ORCESTRA to work with clinical radiological data. The proposed pipeline was developed using the RADCURE [12] dataset.
It consists of 3,346 HNC CT image volumes, corresponding radiotherapy structure sets (RTSTRUCT) containing primary gross tumour volume (GTVp) contours, and clinical data. The Med-ImageTools library was used to generate complete file lists for each CT acquisition, associate these with the correct RTSTRUCT, and load both as SimpleITK [13] images. For each GTVp, preprocessing, quality checking, and radiomic feature extraction were performed using PyRadiomics. Extraction settings from the RADCURE prognostic modelling challenge [8] were applied. Feature extraction was repeated with two negative control samples for each CT, either by shuffling voxel index values or randomly generating voxel values within the range of values in the original CT [7] (Figure 1). [Figure 1 omitted] The standard for data organization on ORCESTRA [11] is the MultiAssayExperiment R object [14], designed to harmonize multiple experimental assays from an overlapping patient set. To leverage this for radiomics, each set of extracted features becomes an experiment, with clinical data included as the primary metadata describing each patient. To demonstrate the pipeline's utility, we replicated previously published survival analysis models with the training and test cohorts from the RADCURE challenge subset [8]. Coefficients from the MW2018 [7] and Kwan [9] models were used to calculate prognostic index values for the test cohort. For comparison, we fit a Cox model to the RADCURE training cohort using the same radiomic signature and applied it to the test cohort. A univariable model for GTVp Mesh Volume was also tested. All models were compared using the concordance index. We processed 2,949 patients with GTVp contours, for a total of 2,988 GTVps from patients with varying primary tumour sites. We extracted 1,317 radiomic features from the CT and the negative control volume for each GTVp. For the 37 patients with multiple GTVps, features were extracted independently for each contour.
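The two negative control constructions described above (shuffling the voxel values of a CT, or drawing random values within the original CT's value range) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's own implementation: the function names are hypothetical, the CT is a synthetic NumPy array rather than a SimpleITK image, and the random values are assumed to be drawn uniformly.

```python
import numpy as np


def shuffled_control(volume, rng):
    """Return a copy of `volume` with the same voxel values in random positions."""
    flat = volume.ravel().copy()   # copy so the input volume is left untouched
    rng.shuffle(flat)              # in-place permutation of the 1-D copy
    return flat.reshape(volume.shape)


def random_control(volume, rng):
    """Return a volume of integers drawn uniformly from [min, max] of `volume`.

    Assumes an integer-valued volume (e.g. CT intensities in Hounsfield units).
    """
    lo, hi = int(volume.min()), int(volume.max())
    sampled = rng.integers(lo, hi, size=volume.shape, endpoint=True)
    return sampled.astype(volume.dtype)


# Synthetic stand-in for a CT volume (hypothetical data, for illustration only).
rng = np.random.default_rng(seed=0)
ct = rng.integers(-1000, 1000, size=(4, 8, 8)).astype(np.int16)

shuffled = shuffled_control(ct, rng)
randomized = random_control(ct, rng)
```

Shuffling preserves the intensity histogram while destroying spatial texture, whereas random generation preserves only the value range. Features that remain predictive on either control are therefore unlikely to reflect image texture.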
The final data object containing all of these features, along with the clinical data and PyRadiomics configuration file, is available at https://www.orcestra.ca/radiomicset/10.5281/zenodo.8332910. The pipeline implementation is published at https://github.com/BHKLAB-DataProcessing/RADCUREradiomics. Results from our radiomics analysis are available in Table 1. The subset of 2,400 GTVs was split into training and test cohorts based on the 'RADCURE-challenge' label in the clinical data. The Kwan model was tested with the oropharynx patients only. Model performance is similar whether the features were extracted from the CT or negative control samples, signaling that the radiomic signature is likely highly correlated with tumour volume, a known confounder of radiomics analysis. This is confirmed by the comparable performance of the univariable volume model. [Table 1 omitted] This standardized architecture framework and the publicly available processed RADCURE dataset can be used to benchmark new datasets or radiomics models semi-automatically. Future work will include organ-at-risk and nodal targets in the RADCURE dataset and the production of ORCESTRA objects for other publicly available HNC datasets. We anticipate that this pipeline and the RADCURE objects generated could be a standard testing benchmark for future radiomics analyses and publications. [ABSTRACT FROM AUTHOR]
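The model comparison in the abstract rests on two quantities: a prognostic index (a weighted sum of feature values using fixed model coefficients) and the concordance index. The sketch below shows one simple way to compute both. It is an illustration only: the feature names, coefficients, and patient data are hypothetical and are not the published MW2018 or Kwan values, and the concordance index is a simplified form of Harrell's C that skips pairs with tied survival times.

```python
from itertools import combinations


def prognostic_index(features, coefficients):
    """Weighted sum of feature values using fixed model coefficients."""
    return sum(coefficients[name] * features[name] for name in coefficients)


def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs whose risk ordering matches outcome.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] == 1). Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical coefficients and one hypothetical patient's feature values.
coefficients = {"feature_a": 0.5, "feature_b": -1.0}
patient = {"feature_a": 2.0, "feature_b": 3.0}
pi_value = prognostic_index(patient, coefficients)

# Hypothetical test cohort: follow-up time, event flag (1 = event), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
print(pi_value, c_index)
```

A concordance index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so comparing the value obtained on CT-derived features against the value obtained on negative control features indicates how much of the signal depends on image texture.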

Details

Language :
English
ISSN :
01678140
Volume :
192
Database :
Academic Search Index
Journal :
Radiotherapy & Oncology
Publication Type :
Academic Journal
Accession number :
176923744
Full Text :
https://doi.org/10.1016/S0167-8140(24)00437-7