1. Single-cell investigative genetics: Single-cell data produces genotype distributions concentrated at the true genotype across all mixture complexities.
- Author
- Grgicak CM, Bhembe Q, Slooten K, Sheth NC, Duffy KR, and Lun DS
- Subjects
- Humans, Bayes Theorem, Genotype, DNA genetics, Likelihood Functions, DNA Fingerprinting methods, Microsatellite Repeats
- Abstract
In the absence of a suspect the forensic aim is investigative, and the focus is one of discerning what genotypes best explain the evidence. In traditional systems, the list of candidate genotypes may become vast if the sample contains DNA from many donors or the information from a minor contributor is swamped by that of major contributors, leading to lower evidential value for a true donor's contribution and, as a result, possibly overlooked or inefficient investigative leads. Recent developments in single-cell analysis offer a way forward, by producing data capable of discriminating genotypes. This is accomplished by first clustering single-cell data by similarity without reference to a known genotype. With good clustering it is reasonable to assume that the scEPGs in a cluster are of a single contributor. With that assumption we determine the probability of a cluster's content given each possible genotype at each locus, which is then used to determine the posterior probability mass distribution for all genotypes by application of Bayes' rule. A decision criterion is then applied such that the sum of the ranked probabilities of all genotypes falling in the set is at least 1-α. This is the credible genotype set and is used to inform database search criteria. Within this work we demonstrate the salience of single-cell analysis by performance testing a set of 630 previously constructed admixtures containing up to 5 donors of balanced and unbalanced contributions. We use scEPGs that were generated by isolating single cells, employing a direct-to-PCR extraction treatment, amplifying STRs that are compliant with existing national databases and applying post-PCR treatments that elicit a detection limit of one DNA copy. We determined that, for these test data, 99.3% of the true genotypes are included in the 99.8% credible set, regardless of the number of donors that comprised the mixture. 
We also determined that the most probable genotype was the true genotype for 97% of the loci when the number of cells in a cluster was at least two. Since efficient investigative leads will be borne by posterior mass distributions that are narrow and concentrated at the true genotype, we report that, for this test set, 47,900 (86%) loci returned only one credible genotype and of these 47,551 (99%) were the true genotype. When determining the LR for true contributors, 91% of the clusters rendered LR > 10^18, showing the potential of single-cell data to positively affect investigative reporting.
Competing Interests: Declaration of Competing Interest Catherine M. Grgicak, Desmond S. Lun and Ken R. Duffy are authors of US Patent Application for SYSTEMS AND METHODS FOR AUTOMATED ANALYSES OF A BIOLOGICAL SAMPLE Patent Application (Application #20220270712).
(Copyright © 2024 Elsevier B.V. All rights reserved.)
- Published
- 2024
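The abstract describes two steps for each locus of a cluster: Bayes' rule turns per-genotype likelihoods into a posterior probability mass distribution, and the top-ranked genotypes are then accumulated until their summed probability is at least 1-α. The following is a minimal sketch of those two steps only, not the authors' implementation. The function names, the uniform prior, and the likelihood values are all hypothetical and chosen purely for illustration; the record does not describe the authors' model for the probability of a cluster's content given a genotype, so it is not reproduced here.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]  # an unordered allele pair at one STR locus


def posterior(likelihoods: Dict[Genotype, float],
              priors: Dict[Genotype, float]) -> Dict[Genotype, float]:
    """Bayes' rule: P(g | cluster) is proportional to P(cluster | g) * P(g)."""
    unnormalised = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("all genotypes have zero posterior weight")
    return {g: w / total for g, w in unnormalised.items()}


def credible_set(post: Dict[Genotype, float], alpha: float) -> List[Genotype]:
    """Top-ranked genotypes whose summed posterior mass is at least 1 - alpha."""
    ranked = sorted(post.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[Genotype] = []
    mass = 0.0
    for genotype, prob in ranked:
        chosen.append(genotype)
        mass += prob
        if mass >= 1.0 - alpha:
            break
    return chosen


# Hypothetical likelihoods P(cluster | genotype) at a single locus.
# These numbers are invented for illustration; they are not from the paper.
example_likelihoods: Dict[Genotype, float] = {
    ("12", "14"): 0.90,
    ("12", "12"): 0.05,
    ("14", "14"): 0.04,
    ("12", "13"): 0.01,
}
# Assumption: a uniform prior over the candidate genotypes.
example_priors = {g: 1.0 / len(example_likelihoods) for g in example_likelihoods}

example_post = posterior(example_likelihoods, example_priors)
print(credible_set(example_post, alpha=0.2))
```

A small α, such as the 0.002 implied by the 99.8% credible set in the abstract, widens the set whenever the posterior is not sharply concentrated; a locus returns a single credible genotype only when the top-ranked genotype alone carries at least 1-α of the mass.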