1. Abstract 6582: A path to sustainability for the National Cancer Institute's Cancer Research Data Commons
- Author
- Juergen A. Klenk, Angela Maggio, Bhavani S. Singh, Erin N. Byrne, Erika Kim, and Tanja M. Davidsen
- Subjects
- Cancer Research, Oncology
- Abstract
Purpose: The volume of data produced by the biomedical research community continues to grow rapidly. Efficient collection, curation, and sharing of these data are expected to enable researchers to synthesize larger datasets and apply novel methods to discover new patterns for diagnosis, treatment, and care of disease. Consequently, data platforms that support these functionalities have become an important instrument for the biomedical research community. The National Cancer Institute’s (NCI’s) Cancer Research Data Commons (CRDC) is a data platform for the cancer research community that provides cloud-based, secure storage and analytic tools for cancer data, including genomics, proteomics, imaging, and clinical trial data. CRDC has been funded by multiple sources, including the Beau Biden Cancer Moonshot program. Achieving long-term sustainability of the CRDC program is the focus of our study.
Methods: We applied a Financial Operations (FinOps) framework to map all costs to functionalities provided by the CRDC, including processes such as data intake, storage, compute, user support, and project management. We established a comprehensive financial baseline for the operations of the CRDC, identified opportunities for optimization across multiple dimensions (people, processes, technology), projected future costs based on trends for cancer data, and evaluated sustainability under defined funding scenarios.
Results: We identified 15 recommendations for optimization of the CRDC, including automation and centralization of core services (e.g., intake, curation, indexing), optimization of storage (e.g., compression, archiving), and harmonization of common services across the CRDC components (e.g., common architecture, governance). We were able to demonstrate that, if implemented, these recommendations offer a path to long-term sustainability of the CRDC program under a variety of funding scenarios.
Conclusion: Data platforms such as the CRDC will see rapid increases in operational costs due to growing data volumes and demand from researchers. Optimizing governance and operational frameworks will result in efficiency gains that can ensure long-term sustainability. Taking these steps will be critical to providing the necessary infrastructure to advance the field of data-driven biomedical research.
Citation Format: Juergen A. Klenk, Angela Maggio, Bhavani S. Singh, Erin N. Byrne, Erika Kim, Tanja M. Davidsen. A path to sustainability for the National Cancer Institute's Cancer Research Data Commons. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 6582.
- Published
- 2023