1. Are we there yet? A machine learning architecture to predict organotropic metastases
- Author
- Michael Skaro, Marcus Hill, Jonathan Arnold, Yi Zhou, Mandi M. Murph, Andrea Sboner, Shannon Quinn, and Melissa Davis
- Subjects
Databases, Factual; Cancer metastasis; Feature selection; Disease; Biology; QH426-470; Machine learning; Transcriptome; Neoplasms; Genetics; Tissue specific; Humans; Internal medicine; Genetics (clinical); Cancer; Metastatic organotropism; Transcriptomic profiling; Gene Expression Profiling; RC31-1245; Human genetics; Technical Advance; Artificial intelligence; DNA microarray
- Abstract
Background & Aims: Cancer metastasis into distant organs is an evolutionarily selective process. A better understanding of the driving forces that endow tumor seeds with proliferative plasticity in distant soils is required to develop and adapt better treatment systems for this lethal stage of the disease. To this end, we aimed first to use transcript expression profiling features to predict the site-specific metastases of primary tumors and second to identify the determinants of tissue-specific progression.
Methods: We used statistical machine learning for transcript feature selection to optimize classification and built tree-based classifiers to predict tissue-specific sites of metastatic progression.
Results: We developed a novel machine learning architecture that analyzes 33 types of RNA transcriptome profiles from The Cancer Genome Atlas (TCGA) database. Our classifier identifies the tumor type, derives synthetic instances of primary tumors metastasizing to distant organs, and classifies the site-specific metastases in 16 types of cancer metastasizing to 12 locations.
Conclusions: We have demonstrated that site-specific metastatic progression is predictable using transcriptomic profiling data from primary tumors and that the overrepresented biological processes in tumors metastasizing to congruent distant loci are highly overlapping. These results indicate that site-specific progression is organotropic and that core features of biological signaling pathways can be identified that may describe proliferative plasticity in distant soils.
- Published
- 2021