Accurate prediction of quantitative traits with failed SNP calls in canola and maize

Authors :
Sven E. Weber
Harmeet Singh Chawla
Lennard Ehrig
Lee T. Hickey
Matthias Frisch
Rod J. Snowdon
Source :
Frontiers in Plant Science, Vol 14 (2023)
Publication Year :
2023
Publisher :
Frontiers Media S.A., 2023.

Abstract

In modern plant breeding, genomic selection is becoming the gold standard for selecting superior genotypes in large breeding populations that are only partially phenotyped. Many breeding programs rely on single-nucleotide polymorphism (SNP) markers to capture genome-wide data for selection candidates. For this purpose, SNP arrays with moderate to high marker density are a robust and cost-effective tool for generating reproducible, easy-to-handle, high-throughput genotype data from large-scale breeding populations. However, SNP arrays are prone to technical errors that lead to failed allele calls. To overcome this problem, failed calls are often imputed, on the assumption that they are purely technical. This ignores biological causes of failed calls, such as deletions, and there is increasing evidence that gene presence–absence and other kinds of genome structural variants can play a role in phenotypic expression. Because deletions are frequently not in linkage disequilibrium with their flanking SNPs, imputation of missing SNP calls can potentially obscure valuable marker–trait associations. In this study, we analyze published datasets for canola and maize using four parametric and two machine learning models and demonstrate that failed allele calls are highly predictive of important agronomic traits in genomic prediction. We present two statistical pipelines, based on population structure and linkage disequilibrium, that enable the filtering of failed SNP calls that are likely to have biological causes. For the population and trait examined, prediction accuracy based on these filtered failed allele calls was competitive with standard SNP-based prediction, underlining the potential value of missing data in genomic prediction approaches. Combining SNPs with either all failed allele calls or the filtered failed allele calls did not outperform SNP-only prediction, owing to redundancy in genomic relationship estimates.
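
The idea of treating failed calls as markers in their own right can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_freq` frequency filter, the simple centred relationship estimate, and the ridge penalty `lam` are all assumptions made for illustration, and the population-structure and linkage-disequilibrium filters described in the abstract are not reproduced here. The sketch assumes a genotype matrix (individuals by markers) in which failed calls are stored as NaN.

```python
import numpy as np

def failed_call_markers(genotypes, min_freq=0.05):
    """Encode failed calls (NaN) as 1 and successful calls as 0.

    Columns whose failure frequency lies outside [min_freq, 1 - min_freq]
    are dropped (a hypothetical filter, not one of the paper's pipelines).
    """
    failed = np.isnan(genotypes).astype(float)
    freq = failed.mean(axis=0)
    keep = (freq >= min_freq) & (freq <= 1.0 - min_freq)
    return failed[:, keep]

def relationship_matrix(markers):
    """Simple relationship estimate from column-centred markers."""
    centred = markers - markers.mean(axis=0)
    return centred @ centred.T / markers.shape[1]

def kernel_ridge_predict(K, y, train, test, lam=1.0):
    """Predict test phenotypes from a relationship matrix (GBLUP-like).

    Solves (K_train + lam * I) alpha = y_train - mean, then projects
    onto the test individuals through their relationships to the
    training individuals.
    """
    mu = y[train].mean()
    K_tt = K[np.ix_(train, train)]
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha
```

In use, `failed_call_markers` would be applied to the raw (unimputed) array data, the result passed to `relationship_matrix`, and predictions from `kernel_ridge_predict` compared against the same model fitted on a SNP-based relationship matrix. If the two matrices describe nearly the same relationships among individuals, combining them would be expected to add little, which is consistent with the redundancy noted in the abstract.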

Details

Language :
English
ISSN :
1664-462X
Volume :
14
Database :
Directory of Open Access Journals
Journal :
Frontiers in Plant Science
Publication Type :
Academic Journal
Accession number :
edsdoj.919fef36b98741898b87eb9242a35ed6
Document Type :
article
Full Text :
https://doi.org/10.3389/fpls.2023.1221750