531. Computational Pipeline for the Identification of Integration Sites and Novel Method for the Quantification of Clone Sizes in Clonal Tracking Studies

Authors :
Serena Scala
Luca Biasco
Luca Basso Ricci
Davide Cittaro
Danilo Pellin
Francesca Dionisio
Alessandro Aiuti
Clelia Di Serio
Lorena Leonardelli
Source :
Molecular Therapy. 24:S212-S213
Publication Year :
2016
Publisher :
Elsevier BV, 2016.

Abstract

Gene-corrected cells in Gene Therapy (GT) treated patients can be tracked in vivo by means of vector integration site (IS) analysis, since each engineered clone becomes uniquely and stably marked by an individual IS. Because proper IS identification and quantification are crucial for accurate clonal tracking studies, we designed a customizable pipeline to analyze LAM-PCR amplicons sequenced with Illumina MiSeq/HiSeq technology. The sequencing data are first processed through a series of quality filters and cleaned of vector and Linker Cassette (LC) sequences, with customizable settings. Demultiplexing is then performed by recognizing the specific barcode combinations used during library preparation, and the sequences are aligned to the reference genome. Importantly, the human genome assembly Hg19 is composed of 93 contigs, which include the mitochondrial genome, unlocalized and unplaced contigs, and some alternative haplotypes of chr6. While previous approaches aligned IS sequences only to the standard 24 human chromosomes, using the whole assembled genome improved alignment accuracy and at the same time increased the number of detectable ISs. To date, we have processed 28 independent human sample sets, retrieving 260,994 ISs from 189,270,566 sequencing reads. Although sequencing read counts at each IS have been widely used to estimate relative IS abundance, this method carries inherent accuracy constraints, because the rounds of exponential amplification required by LAM-PCR might unbalance the original clonal representation. More recently, a method based on genomic sonication has been proposed that uses shear site counts to tag the number of original fragments belonging to each IS before PCR amplification. However, the number of cells composing a given clone could far exceed the number of fragments of different lengths that can be generated by fragmentation in proximity to that IS.
This would rapidly saturate the available diversity of shear sites and progressively generate more and more same-site shearing on independent genomes. To overcome these biases and reliably quantify ISs, we designed and tested a new LC encoding random barcodes. The new LC is composed of a known 29-nt sequence used as the primer binding site during the amplification steps, a 6-nt random barcode, a fixed 6-nt anchor sequence, a second 6-nt random barcode, and a final known 22-nt sequence containing sticky ends for the three main restriction enzymes in use (MluI, HpyCH4IV and AciI). This design increased the accuracy of clonal diversity estimation, since the fixed anchor sequence acts as a control for sequencing reliability in the barcode area. The theoretical number of different barcodes available per clone (4^12 = 16,777,216) far exceeds what is required to avoid saturating the original diversity of the analyzed sample (composed on average of around 50,000 cells). We validated this novel approach by performing assays on serial dilutions of individual clones carrying known ISs. The precision obtained averaged around 99.3%, and the worst-case error rate was at most 1.86%, confirming the reliability of IS quantification. We successfully applied the barcoded-LC system to the analysis of clinical samples from a Wiskott-Aldrich Syndrome GT patient, collecting to date 50,215 barcoded ISs from 94,052,785 sequencing reads.
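
The barcoded-LC layout described in the abstract (29-nt known sequence, 6-nt random barcode, 6-nt fixed anchor, 6-nt random barcode) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The anchor sequence `ACGTAC`, the function names, the assumption that the LC sits at the start of the read in the orientation shown, and the use of distinct barcodes per IS as the abundance estimate are all hypothetical choices made for illustration; only the segment lengths come from the abstract.

```python
from collections import defaultdict

PRIMER_SITE_LEN = 29  # known primer-binding sequence (length from the abstract)
BARCODE_LEN = 6       # length of each of the two random barcodes
ANCHOR = "ACGTAC"     # HYPOTHETICAL 6-nt anchor; the real sequence is not given


def theoretical_barcodes(random_nt=2 * BARCODE_LEN):
    """Number of distinct barcodes over `random_nt` random positions."""
    return 4 ** random_nt


def extract_barcode(lc_read, anchor=ANCHOR):
    """Return the combined 12-nt barcode, or None if the read fails the checks.

    Assumes the read starts at the first base of the LC (an assumption).
    """
    b1_start = PRIMER_SITE_LEN
    anchor_start = b1_start + BARCODE_LEN
    b2_start = anchor_start + len(anchor)
    first = lc_read[b1_start:anchor_start]
    middle = lc_read[anchor_start:b2_start]
    second = lc_read[b2_start:b2_start + BARCODE_LEN]
    # The fixed anchor acts as a sequencing-reliability control:
    # reads whose anchor does not match exactly are discarded.
    if middle != anchor or len(second) != BARCODE_LEN:
        return None
    # Reject ambiguous base calls (e.g. N) inside the random barcodes.
    if set(first + second) - set("ACGT"):
        return None
    return first + second


def count_barcodes_per_is(records):
    """Count distinct barcodes per IS from (is_id, lc_read) pairs.

    Duplicate barcodes at the same IS are treated as PCR copies of one
    original fragment (an assumption of this sketch).
    """
    seen = defaultdict(set)
    for is_id, lc_read in records:
        barcode = extract_barcode(lc_read)
        if barcode is not None:
            seen[is_id].add(barcode)
    return {is_id: len(barcodes) for is_id, barcodes in seen.items()}
```

With two random 6-nt barcodes, `theoretical_barcodes()` reproduces the 4^12 = 16,777,216 figure quoted above. A real implementation would likely also tolerate sequencing errors in the barcodes, which this sketch does not attempt.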

Details

ISSN :
15250016
Volume :
24
Database :
OpenAIRE
Journal :
Molecular Therapy
Accession number :
edsair.doi...........de202fa4c8f2dbfa93480abb4347974d
Full Text :
https://doi.org/10.1016/s1525-0016(16)33340-8