Back to Search Start Over

Statistical modeling of STR capillary electrophoresis signal

Authors :
Slim Karkar
Lauren E. Alfonse
Catherine M. Grgicak
Desmond S. Lun
Source :
BMC Bioinformatics, Vol 20, Iss S16, Pp 1-12 (2019)
Publication Year :
2019
Publisher :
BMC, 2019.

Abstract

Abstract Background In order to isolate an individual’s genotype from a sample of biological material, most laboratories use PCR and Capillary Electrophoresis (CE) to construct a genetic profile based on polymorphic loci known as Short Tandem Repeats (STRs). The resulting profile consists of CE signal which contains information about the length and number of STR units amplified. For samples collected from the environment, interpretation of the signal can be challenging given that information regarding the quality and quantity of the DNA is often limited. The signal can be further compounded by the presence of noise and PCR artifacts such as stutter which can mask or mimic biological alleles. Because manual interpretation methods cannot comprehensively account for such nuances, it would be valuable to develop a signal model that can effectively characterize the various components of STR signal independent of a priori knowledge of the quantity or quality of DNA. Results First, we seek to mathematically characterize the quality of the profile by measuring changes in the signal with respect to amplicon size. Next, we examine the noise, allele, and stutter components of the signal and develop distinct models for each. Using cross-validation and model selection, we identify a model that can be effectively utilized for downstream interpretation. Finally, we show an implementation of the model in NOCIt, a software system that calculates the a posteriori probability distribution on the number of contributors. Conclusion The model was selected using a large, diverse set of DNA samples obtained from 144 different laboratory conditions; with DNA amounts ranging from a single copy of DNA to hundreds of copies, and the quality of the profiles ranging from pristine to highly degraded. Implemented in NOCIt, the model enables a probabilisitc approach to estimating the number of contributors to complex, environmental samples.

Details

Language :
English
ISSN :
14712105
Volume :
20
Issue :
S16
Database :
Directory of Open Access Journals
Journal :
BMC Bioinformatics
Publication Type :
Academic Journal
Accession number :
edsdoj.0bafb71ed85b4be99c99378533ed4bc8
Document Type :
article
Full Text :
https://doi.org/10.1186/s12859-019-3074-0