1. Integrated Computational Pipeline for Single-Cell Genomic Profiling
- Author
Jude Kendall, Joan Alexander, Michael Wigler, Lubomir Chorbadjiev, Viacheslav Zhygulin, Alexander Krasnitz, and Junyan Song
- Subjects
0303 health sciences, Genomic profiling, Genome, Cell, Computational Biology, General Medicine, Computational biology, ORIGINAL REPORTS, Genomics, Biology, 03 medical and health sciences, 0302 clinical medicine, medicine.anatomical_structure, Special Series: Informatics Tools for Cancer Research and Care, medicine, Profiling (information science), Humans, 030217 neurology & neurosurgery, Software, 030304 developmental biology, Tissue biopsy
- Abstract
PURPOSE Copy-number profiling of multiple individual cells from sparse sequencing may be used to reveal a detailed picture of genomic heterogeneity and clonal organization in a tissue biopsy specimen. We sought to provide a comprehensive computational pipeline for single-cell genomics, to facilitate adoption of this molecular technology for basic and translational research.
MATERIALS AND METHODS The pipeline comprises software tools programmed in Python and in R and depends on Bowtie, HISAT2, Matplotlib, and Qt. It is installed and used with Anaconda.
RESULTS Here we describe a complete pipeline for sparse single-cell genomic data, encompassing all steps of single-nucleus DNA copy-number profiling, from raw sequence processing to clonal structure analysis and visualization. For the latter, a specialized graphical user interface termed the single-cell genome viewer (SCGV) is provided. With applications to cancer diagnostics in mind, the SCGV allows for zooming and linkage to the University of California at Santa Cruz Genome Browser from each of the multiple integrated views of single-cell copy-number profiles. The latter can be organized by clonal substructure or by any of the associated metadata such as anatomic location and histologic characterization.
CONCLUSION The pipeline is available as open-source software for Linux and OS X. Its modular structure, extensive documentation, and ease of deployment using Anaconda facilitate its adoption by researchers and practitioners of single-cell genomics. With open-source availability and Massachusetts Institute of Technology licensing, it provides a basis for additional development by the cancer bioinformatics community.
- Published
2020