Jackson A, Virdee PS, Tonner S, Oke JL, Perera R, Riahi K, Luan Y, Hiom S, Kumar H, Nandani H, Kurtzman KN, Huws D, Allan D, Smits S, McPhail S, Parkes EE, Hobbs FDR, Middleton MR, and Nicholson BD
Background: Cancer places a high burden on society and health-care systems. Cancer research requires high-quality data, which are resource-intensive to obtain. Using administrative datasets such as cancer registries could improve the efficiency of cancer studies if the data were valid and timely. We aimed to compare the validity and timeliness of diagnostic cancer data collected on-site during the SYMPLIFY study with those obtained from the cancer registries of England and Wales.

Methods: Cancer data were collected from 5461 participants across 44 hospital sites during a prospective observational study in England and Wales, SYMPLIFY (ISRCTN10226380). Linked cancer data were obtained from Digital Health and Care Wales (DHCW), the Welsh Cancer Intelligence and Surveillance Unit (WCISU), and the English National Cancer Registration Dataset (NCRD) and Rapid Cancer Registration Dataset (RCRD), regularly between April, 2022, and September, 2023. The primary objectives of the study were to evaluate the validity (via assessment of the proportion of completed data fields and concordance with SYMPLIFY sites) and timeliness of the data in all datasets, for all cancers diagnosed within 9 months of study enrolment. Data fields investigated were cancer site via International Classification of Diseases, 10th Revision (ICD-10) code; cancer morphology via International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) morphology histology code and broad morphological grouping; overall stage; and TNM classification.

Findings: For data collected between April, 2022, and September, 2023, completeness at the last data cut available for each dataset ranged from 84% to 100% for ICD-O-3 morphology, from 43% to 100% for overall stage, and from 74% to 83% for TNM stage.
The concordance between SYMPLIFY data and NCRD was 96% (95% CI 92-98) for ICD-10, 60% (53-66) for ICD-O-3 morphology, 83% (78-88) for ICD-O-3 broad morphology groupings, 73% (67-78) for stage, and 51% (44-59) for TNM; and with WCISU was 89% (95% CI 81-94) for ICD-10, 63% (53-73) for ICD-O-3 morphology, 80% (70-87) for ICD-O-3 broad morphology groupings, 83% (74-90) for overall stage, and 49% (38-61) for TNM stage. Concordance between SYMPLIFY and RCRD was 95% (95% CI 92-98) for ICD-10, 67% (60-74) for ICD-O-3 morphology, 85% (79-90) for ICD-O-3 broad morphology groupings, and 73% (65-80) for overall stage; and between SYMPLIFY and DHCW was 96% (91-99) for ICD-10, 74% (64-83) for ICD-O-3 morphology, 84% (75-91) for ICD-O-3 broad morphology groupings, and 87% (74-95) for stage. The SYMPLIFY dataset reached completion at 12 months post-enrolment in November, 2022, compared with 13 months for NCRD in December, 2023. RCRD and DHCW reached completion at 13 months and 15 months post-enrolment, in December, 2022, and February, 2023, respectively.

Interpretation: We report similar completeness of data fields, concordance, and timeliness between on-site and centrally collected cancer outcomes data. Our findings suggest that central registry data can help alleviate the resource burden in clinical trials and improve cancer research. Cancer registries might need additional resources to provide data for registry-based trials at scale.

Funding: GRAIL Bio UK.

Competing Interests: BDN and MRM receive institutional research funding from GRAIL. BDN reports grants, honoraria, and consulting fees from the National Institute for Health and Care Research (NIHR), Cancer Research UK, Royal College of General Practitioners, and Multi-Cancer Early Detection Consortium.
DH and DA report payment to their institution from National Health Service Wales Cancer Network for additional cancer registration officer time to expedite population-based cancer registration for this study. DH and DA report grants from Moondance Cancer Initiative. FDRH reports honoraria for occasional sessional payment for congress talks for AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, and Pfizer. RP reports funding from GRAIL and National Health Service England for the completion of this study, as well as grants from Medical Research Council, NIHR, and the Medical Science Division at the University of Oxford, and is a statistical editor for BMJ and BMJ Medicine. HK, KNK, SH, KR, YL, and HN were all employees of GRAIL at the time of the study. HK reports a leadership position with GRAIL. All other authors declare no competing interests.

(Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.)
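
The two validity measures named in the Methods (proportion of completed data fields, and concordance between site and registry values) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names, the toy ICD-10 codes, the choice to count concordance only over pairs where both sources have a value, and the Wilson score interval. The abstract does not state how its 95% CIs were calculated, so this should not be read as the study's method.

```python
from math import sqrt

MISSING = (None, "")  # hypothetical convention for an uncompleted field


def completeness(values):
    """Proportion of records in which the field is filled in."""
    filled = sum(1 for v in values if v not in MISSING)
    return filled / len(values)


def concordance_counts(site_values, registry_values):
    """Return (agreeing pairs, comparable pairs).

    Assumption: a pair is comparable only if both sources hold a value.
    """
    pairs = [
        (s, r)
        for s, r in zip(site_values, registry_values)
        if s not in MISSING and r not in MISSING
    ]
    agree = sum(1 for s, r in pairs if s == r)
    return agree, len(pairs)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Toy, made-up ICD-10 codes for five participants (not study data).
site = ["C34", "C18", "C50", "C61", None]
registry = ["C34", "C18", "C50", "C25", "C20"]

site_completeness = completeness(site)
agree, n_pairs = concordance_counts(site, registry)
ci_low, ci_high = wilson_interval(agree, n_pairs)
print(f"completeness={site_completeness:.2f}, "
      f"concordance={agree}/{n_pairs}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With real linked data the same functions would be applied per field (ICD-10, ICD-O-3 morphology, stage, TNM) and per registry, which is how a table of field-by-dataset concordance like the one summarised in the Findings could be assembled.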