Scywalker: scalable end-to-end data analysis workflow for long-read single-cell transcriptome sequencing.
- Author
De Rijk, Peter; Watzeels, Tijs; Küçükali, Fahri; Van Dongen, Jasper; Faura, Júlia; Willems, Patrick; De Deyn, Lara; Duchateau, Lena; Grones, Carolin; Eekhout, Thomas; De Pooter, Tim; Joris, Geert; Rombauts, Stephane; De Rybel, Bert; Rademakers, Rosa; Van Breusegem, Frank; Strazisar, Mojca; Sleegers, Kristel; De Coster, Wouter
- Subjects
STATISTICAL correlation, QUALITY control, DEMULTIPLEXING, DATA analysis, CELL lines
- Abstract
Motivation: Existing nanopore single-cell data analysis tools showed severe limitations in handling current data sizes. Results: We introduce scywalker, an innovative and scalable package developed to comprehensively analyze long-read sequencing data of full-length single-cell or single-nuclei cDNA. We developed novel scalable methods for cell barcode demultiplexing and for single-cell isoform calling and quantification, and incorporated these in an easily deployable package. Scywalker streamlines the entire analysis process, from sequenced fragments in FASTQ format to demultiplexed pseudobulk isoform counts, into a single command suitable for execution on either a server or a cluster. Scywalker includes data quality control, cell type identification, and an interactive report. Assessment of datasets from the human brain, Arabidopsis leaves, and previously benchmarked data from mixed cell lines demonstrates excellent correlation with short-read analyses at both the cell-barcoding and gene quantification levels. At the isoform level, we show that scywalker facilitates the direct identification of cell-type-specific expression of novel isoforms. Availability and implementation: Scywalker is available on github.com/derijkp/scywalker under the GNU General Public License (GPL) and at https://zenodo.org/records/13359438/files/scywalker-0.108.0-Linux-x86_64.tar.gz. [ABSTRACT FROM AUTHOR]
- Published
- 2024