1. Rare copy number variant analysis in case-control studies using SNP array data: a scalable and automated data analysis pipeline.
- Author
- Artaza H, Lavrichenko K, Wolff ASB, Røyrvik EC, Vaudel M, and Johansson S
- Subjects
- Humans; Case-Control Studies; Genome, Human; Oligonucleotide Array Sequence Analysis/methods; Software; Data Analysis; Quality Control; Polymorphism, Single Nucleotide/genetics; DNA Copy Number Variations/genetics; Computational Biology/methods
- Abstract
Background: Rare copy number variants (CNVs) significantly influence the human genome and may contribute to disease susceptibility. High-throughput SNP genotyping platforms provide data that can be used for CNV detection, but doing so requires complex pipelining of bioinformatic tools. Here, we propose a flexible bioinformatic pipeline for rare CNV analysis from human SNP array data.
Results: The pipeline consists of two major sub-pipelines: (1) calling and quality control (QC) analysis, and (2) rare CNV analysis. It is implemented in Snakemake following a rule-based structure that enables automation and scalability while maintaining flexibility.
Conclusions: Our pipeline automates the detection and analysis of rare CNVs. It implements rigorous CNV quality control, assesses the frequencies of these rare CNVs in patients versus controls, and evaluates the impact of CNVs on specific genes or pathways. We hence aim to provide an efficient yet flexible bioinformatic framework to investigate rare CNVs in biomedical research.
Competing Interests: The authors declare that they have no competing interests.
© 2024. The Author(s).
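Illustrative note: the abstract states that the pipeline compares rare CNV frequencies in patients versus controls but does not name the statistical test used. The sketch below shows one common way such a comparison could be made, a two-sided Fisher's exact test on a 2x2 carrier table, written in plain Python. The function name, its parameters, and the example counts are hypothetical and are not taken from the published pipeline.

```python
from math import comb


def fisher_exact_two_sided(case_carriers, case_total, control_carriers, control_total):
    """Two-sided Fisher's exact test on a 2x2 carrier table.

    Hypothetical helper, not part of the published pipeline.
    Returns (odds_ratio, p_value).
    """
    a = case_carriers                      # cases carrying a rare CNV
    b = case_total - case_carriers         # cases without one
    c = control_carriers                   # controls carrying a rare CNV
    d = control_total - control_carriers   # controls without one
    n = a + b + c + d
    cases = a + b
    carriers = a + c

    def prob(x):
        # Hypergeometric probability of x carriers among cases, margins fixed.
        return comb(carriers, x) * comb(n - carriers, cases - x) / comb(n, cases)

    p_observed = prob(a)
    lowest = max(0, cases - (n - carriers))
    highest = min(cases, carriers)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    # Sample odds ratio; reported as infinite when the denominator is zero.
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


if __name__ == "__main__":
    # Made-up counts for illustration only: 12 of 500 cases and 4 of 1000 controls.
    odds_ratio, p_value = fisher_exact_two_sided(12, 500, 4, 1000)
    print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4g}")
```

An exact test is a natural choice here because rare CNV carrier counts are typically small, where large-sample approximations can be unreliable.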
- Published
- 2024