1. PyPop: a mature open-source software pipeline for population genomics
- Author
Lancaster, Alexander K; Single, Richard M; Mack, Steven J; Sochat, Vanessa; Mariani, Michael P; Webster, Gordon D
- Subjects
Biological Sciences, Genetics, Human Genome, Generic health relevance, Population, Genotype, Haplotypes, Metagenomics, Software, Meta-Analysis as Topic, HLA, MHC, population genomics, bioinformatics, Immunology, Medical Microbiology, Biochemistry and cell biology
- Abstract
Python for Population Genomics (PyPop) is a software package that processes genotype and allele data and performs large-scale population genetic analyses on highly polymorphic multi-locus genotype data. In particular, PyPop tests data conformity to Hardy-Weinberg equilibrium expectations, performs Ewens-Watterson tests for selection, estimates haplotype frequencies, measures linkage disequilibrium, and tests significance. A standardized means of performing these tests is key to contemporary studies of evolutionary biology and population genetics, and the tests are also central to genetic studies of disease association. Here, we present PyPop 1.0.0, a new major release of the package, which implements new features using the more robust infrastructure of GitHub and is distributed via the industry-standard Python Package Index. New features include implementation of the asymmetric linkage disequilibrium measures and, of particular interest to the immunogenetics research communities, support for modern nomenclature, including colon-delimited allele names, and improvements to meta-analysis features for aggregating outputs for multiple populations. Code is available at https://zenodo.org/records/10080668 and https://github.com/alexlancaster/pypop.
- Published
- 2024