1. PeSV-Fisher: identification of somatic and non-somatic structural variants using next generation sequencing data
- Author
Mario Cáceres, Stephan Ossowski, Geòrgia Escaramís, Alexander Martínez-Fundichely, Laia Bassaganyas, Cristian Tornador, Xavier Estivill, Raquel Rabionet, Marta Gut, and Jose M. C. Tubio
- Subjects
lcsh:Medicine ,Human genomics ,Genome ,Chromosomal Disorders ,0302 clinical medicine ,Sequence alignment ,Databases, Genetic ,Cancer genomics ,Genomic library ,Genome Sequencing ,lcsh:Science ,Genètica -- Bases de dades ,Cancers and neoplasms ,Genetics ,0303 health sciences ,Multidisciplinary ,Chromosomal Deletions and Duplications ,ADN -- Anàlisi ,Genomics ,Genome Scans ,Identification (information) ,Translocations ,Sequence Analysis ,Algorithms ,Research Article ,Genome evolution ,Sequence analysis ,Computational biology ,Biology ,Genome Complexity ,DNA sequencing ,03 medical and health sciences ,Genome Analysis Tools ,Cancer Genetics ,Humans ,030304 developmental biology ,Comparative genomics ,Genome, Human ,lcsh:R ,Breakpoint ,Computational Biology ,Human Genetics ,Sequence Analysis, DNA ,Genome analysis ,Leukemia, Lymphocytic, Chronic, B-Cell ,Genòmica ,Genomic Structural Variation ,Computer Science ,Genetics of Disease ,lcsh:Q ,Structural Genomics ,Transposable elements ,030217 neurology & neurosurgery
- Abstract
Next-generation sequencing technologies have expedited research to develop efficient computational tools for the identification of structural variants (SVs) and their use to study human diseases. As deeper data is obtained, the existence of higher complexity SVs in some genomes becomes more evident, but the detection and definition of most of these complex rearrangements is still in its infancy. The full characterization of SVs is a key aspect for discovering their biological implications. Here we present a pipeline (PeSV-Fisher) for the detection of deletions, gains, intra- and inter-chromosomal translocations, and inversions, at very reasonable computational costs. We further provide comprehensive information on co-localization of SVs in the genome, a crucial aspect for studying their biological consequences. The algorithm uses a combination of methods based on paired-reads and read-depth strategies. PeSV-Fisher has been designed with the aim of facilitating identification of somatic variation, and, as such, it is capable of analysing two or more samples simultaneously, producing a list of non-shared variants between samples. We tested PeSV-Fisher on available sequencing data, and compared its behaviour to that of frequently deployed tools (BreakDancer and VariationHunter). We have also tested this algorithm on our own sequencing data, obtained from a tumour and a normal blood sample of a patient with chronic lymphocytic leukaemia, on which we have also validated the results by targeted re-sequencing of different kinds of predictions. This allowed us to determine confidence parameters that influence the reliability of breakpoint predictions.
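The abstract mentions producing a list of variants not shared between samples (for example, tumour versus normal). The record does not describe how PeSV-Fisher does this, so the following is only a minimal illustrative sketch under stated assumptions: SVs are modelled as simple intervals, two calls count as "shared" when they have the same type and a reciprocal overlap of at least 50%, and the names `SV`, `reciprocal_overlap`, and `non_shared` are hypothetical. Inter-chromosomal translocations would need a breakpoint-pair model and are not covered.

```python
from typing import List, NamedTuple


class SV(NamedTuple):
    """Hypothetical interval-style structural variant call."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "deletion", "gain", "inversion"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if the
    calls differ in chromosome or type or do not overlap at all."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def non_shared(sample: List[SV], other: List[SV],
               min_overlap: float = 0.5) -> List[SV]:
    """Keep calls from `sample` with no matching call in `other`.
    The 0.5 threshold is an assumption, not a value from the paper."""
    return [
        sv for sv in sample
        if all(reciprocal_overlap(sv, o) < min_overlap for o in other)
    ]


# Toy data, invented for illustration only.
tumour = [SV("chr1", 1000, 5000, "deletion"), SV("chr2", 200, 900, "inversion")]
normal = [SV("chr1", 1100, 5100, "deletion")]

somatic_candidates = non_shared(tumour, normal)
print(somatic_candidates)
```

With this toy input, the chr1 deletion is treated as shared (germline) and only the chr2 inversion remains as a somatic candidate. A quadratic scan like this is fine for a sketch; real call sets would normally use an interval index.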
This work was supported by AGAUR (Generalitat de Catalunya, 2009 SGR 1502) (X.E.); CIBERESP (Instituto de Salud Carlos III) (G.E.); ESGI (European Commission, 262055_ESGI) (R.R., X.E.), ENGAGE (European Commission, ENGAGE_201413), TECHGENE (European Commission, TECHGENE_223143), and GEUVADIS (European Commission, 261123_GEUVADIS) (X.E.); NOVADIS (Ministerio de Ciencia y Tecnología, SAF2008-00357) (X.E.); the Xunta de Galicia (Spain) through project number 10PXIB918057 (J.M.C.T.); a MAEC-AEC1 Predoctoral Fellowship (Ministerio de Asuntos Exteriores y Cooperación, Spain) (A.M.F.); and a Ramón y Cajal position and grant BFU2007-60930 (Ministerio de Educación y Ciencia) (M.C.).
- Published
2021