Shotgun DNA sequencing for human identification: Dynamic SNP selection and likelihood ratio calculations accounting for errors.

Authors :
Andersen, Mikkel Meyer
Kampmann, Marie-Louise
Jepsen, Alberte Honoré
Morling, Niels
Eriksen, Poul Svante
Børsting, Claus
Andersen, Jeppe Dyrberg
Source :
Forensic Science International: Genetics; Jan2025, Vol. 74, pN.PAG-N.PAG, 1p
Publication Year :
2025

Abstract

Shotgun sequencing is a DNA analysis method that potentially determines the nucleotide sequence of every DNA fragment in a sample, unlike the PCR-based genotyping methods that are widely used in forensic genetics and target predefined short tandem repeats (STRs) or predefined single nucleotide polymorphisms (SNPs). Shotgun DNA sequencing is particularly useful for highly degraded low-quality DNA samples, such as ancient samples or those from crime scenes. Here, we developed a statistical model for human identification using shotgun sequencing data and developed formulas for calculating the evidential weight as a likelihood ratio (LR). The model uses a dynamic set of binary SNP loci and takes the error rate from shotgun sequencing into consideration in a probabilistic manner. To our knowledge, the method is the first to make this possible. Results from replicated shotgun sequencing of buccal swabs (high-quality samples) and hair samples (low-quality samples) were arranged in a genotype-call confusion matrix to estimate the calling error probability by maximum likelihood and Bayesian inference. Different genotype quality filters may be applied to account for genotyping errors. An error probability of zero resulted in the commonly used LR formula for the weight of evidence. Error probabilities above zero reduced the LR contribution of matching genotypes and increased the LR in the case of a mismatch between the genotypes of the trace and the person of interest. In the latter scenario, the LR increased from zero (occurring when the error probability was zero) to low positive values, which allow for the possibility that the mismatch may be due to genotyping errors. We developed an open-source R package, wgsLR, which implements the method, including estimation of the calling error probability and calculation of LR values. The R package includes all formulas used in this paper and the functionalities to generate the formulas.
• Evidential weight as a likelihood ratio (LR) is given for shotgun sequencing data.
• The LR accounts for the genotype calling error probability that we can estimate.
• The method uses binary SNP loci chosen dynamically based on the trace sample.
• We provide an open-source R package, wgsLR, which implements the method.
[ABSTRACT FROM AUTHOR]
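
The behaviour the abstract describes (an error probability of zero recovering the usual LR, and a positive error probability turning a mismatch LR of zero into a small positive value) can be illustrated with a minimal single-locus sketch. This is not the paper's model and not the wgsLR API. It assumes, purely for illustration, one binary SNP in Hardy-Weinberg equilibrium with allele frequency `q`, an error probability `w` under which each of the two alleles of a called genotype flips independently, and the same `w` for the trace and the person of interest. All function names are hypothetical.

```python
def genotype_prior(q):
    """Hardy-Weinberg probabilities of carrying 0, 1 or 2 copies of the allele with frequency q."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def error_matrix(w):
    """P(observed copy count x | true copy count z).

    Illustrative assumption: each of the two alleles is miscalled
    (flipped to the other allele) independently with probability w.
    Rows are indexed by the true count z, columns by the observed count x.
    """
    k = 1.0 - w
    return [
        [k * k, 2 * w * k, w * w],
        [w * k, k * k + w * w, w * k],
        [w * w, 2 * w * k, k * k],
    ]


def likelihood_ratio(x_trace, x_ref, q, w):
    """LR for 'same donor' versus 'unrelated donors' at one binary SNP."""
    prior = genotype_prior(q)
    err = error_matrix(w)
    # Same donor: one true genotype z produced both observed calls.
    numerator = sum(prior[z] * err[z][x_trace] * err[z][x_ref] for z in range(3))
    # Unrelated donors: the two calls arise from independent true genotypes.
    p_trace = sum(prior[z] * err[z][x_trace] for z in range(3))
    p_ref = sum(prior[z] * err[z][x_ref] for z in range(3))
    return numerator / (p_trace * p_ref)


if __name__ == "__main__":
    q = 0.2  # hypothetical allele frequency
    for w in (0.0, 0.01):
        print(w, likelihood_ratio(0, 0, q, w), likelihood_ratio(0, 2, q, w))
```

Under these assumptions, `w = 0` reduces the matching case to one over the genotype probability and the mismatching case to zero, while a small positive `w` lowers the matching LR slightly and raises the mismatching LR to a small positive value. Per-locus values would be multiplied across independent loci to give an overall LR.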

Details

Language :
English
ISSN :
18724973
Volume :
74
Database :
Supplemental Index
Journal :
Forensic Science International: Genetics
Publication Type :
Academic Journal
Accession number :
181219569
Full Text :
https://doi.org/10.1016/j.fsigen.2024.103146