Neural ADMIXTURE for rapid genomic clustering.

Authors :
Mantes AD
Montserrat DM
Bustamante CD
Giró-I-Nieto X
Ioannidis AG
Source :
Nature computational science [Nat Comput Sci] 2023 Jul; Vol. 3 (7), pp. 621-629. Date of Electronic Publication: 2023 Jul 06.
Publication Year :
2023

Abstract

Characterizing the genetic structure of large cohorts has become increasingly important as genetic studies extend to massive, increasingly diverse biobanks. Popular methods decompose individual genomes into fractional cluster assignments, with each cluster representing a vector of DNA variant frequencies. However, with rapidly increasing biobank sizes, these methods have become computationally intractable. Here we present Neural ADMIXTURE, a neural network autoencoder that follows the same modeling assumptions as the current standard algorithm, ADMIXTURE, while reducing the compute time by orders of magnitude, surpassing even the fastest alternatives. One month of continuous compute using ADMIXTURE can be reduced to just hours with Neural ADMIXTURE. A multi-head approach allows Neural ADMIXTURE to offer even further acceleration by calculating multiple cluster numbers in a single run. Furthermore, the models can be stored, allowing cluster assignment to be performed on new data in linear time without needing to share the training samples.

Competing interests: C.D.B. is the CEO of Galatea Bio, and A.G.I. also holds shares. The remaining authors declare no competing interests.

Details

Language :
English
ISSN :
2662-8457
Volume :
3
Issue :
7
Database :
MEDLINE
Journal :
Nature computational science
Publication Type :
Academic Journal
Accession number :
37600116
Full Text :
https://doi.org/10.1038/s43588-023-00482-7