1. A k-mer-based pangenome approach for cataloging seed-storage-protein genes in wheat to facilitate genotype-to-phenotype prediction and improvement of end-use quality.
- Author
Zhang, Zhaoheng, Liu, Dan, Li, Binyong, Wang, Wenxi, Zhang, Jize, Xin, Mingming, Hu, Zhaorong, Liu, Jie, Du, Jinkun, Peng, Huiru, Hao, Chenyang, Zhang, Xueyong, Ni, Zhongfu, Sun, Qixin, Guo, Weilong, and Yao, Yingyin
- Abstract
Wheat is a staple food for more than 35% of the world's population, with wheat flour used to make hundreds of baked goods. Superior end-use quality is a major breeding target; however, improving it is especially time-consuming and expensive. Furthermore, genes encoding seed-storage proteins (SSPs) form multi-gene families and are repetitive, with gaps commonplace in several genome assemblies. To overcome these barriers and efficiently identify superior wheat SSP alleles, we developed "PanSK" (Pan-SSP k-mer) for genotype-to-phenotype prediction based on an SSP-based pangenome resource. PanSK uses 29-mer sequences that represent each SSP gene at the pangenomic level to reveal untapped diversity across landraces and modern cultivars. Genome-wide association studies with k-mers identified 23 SSP genes associated with end-use quality that represent novel targets for improvement. We evaluated the effect of rye secalin genes on end-use quality and found that removal of ω-secalins from 1BL/1RS wheat translocation lines is associated with enhanced end-use quality. Finally, using machine-learning-based prediction inspired by PanSK, we predicted the quality phenotypes with high accuracy from genotypes alone. This study provides an effective approach for genome design based on SSP genes, enabling the breeding of wheat varieties with superior processing capabilities and improved end-use quality.

PanSK is a pangenomic approach based on k-mers that catalogs seed-storage protein (SSP) genes in wheat, enabling prediction and enhancement of end-use quality. Using 29-mer sequences to represent SSP gene diversity at the pangenome level, PanSK facilitates the identification of key genes linked to superior processing qualities. Combined with machine-learning predictions, it significantly advances the breeding of wheat varieties with improved end-use characteristics.
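
The idea of representing each SSP gene by 29-mers can be illustrated with a minimal Python sketch. It is not the PanSK implementation: the function names (`canonical`, `kmers`, `gene_specific_kmers`, `presence_fraction`), the use of canonical (strand-independent) k-mers, and the rule that a marker k-mer must occur in exactly one gene are all assumptions made here for illustration. Only the k-mer length of 29 comes from the abstract.

```python
from collections import defaultdict

K = 29  # k-mer length stated in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmers(seq, k=K):
    """Collect the canonical k-mers of a sequence, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            found.add(canonical(window))
    return found


def gene_specific_kmers(genes, k=K):
    """Map each gene name to the k-mers that occur in that gene only (assumed marker rule)."""
    owners = defaultdict(set)
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    specific = defaultdict(set)
    for kmer, names in owners.items():
        if len(names) == 1:
            specific[next(iter(names))].add(kmer)
    return dict(specific)


def presence_fraction(marker_kmers, sample_kmers):
    """Fraction of a gene's marker k-mers that are observed in a sample's k-mer set."""
    if not marker_kmers:
        return 0.0
    return len(marker_kmers & sample_kmers) / len(marker_kmers)
```

Under these assumptions, a gene or allele would be called present in an accession when most of its marker k-mers appear in the k-mers derived from that accession's sequencing reads. The resulting presence values could then serve as genotype features for association testing or phenotype prediction.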
- Published
- 2024