387 results on '"Calus, Mario P. L."'
Search Results
102. Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein–Friesian cattle
- Author
-
Veerkamp, Roel F., primary, Bouwman, Aniek C., additional, Schrooten, Chris, additional, and Calus, Mario P. L., additional
- Published
- 2016
- Full Text
- View/download PDF
103. The impact of whole genome sequence data to prioritise animals for genetic diversity conservation
- Author
-
Eynard, Sonia, Windig, Jack J., Hiemstra, Sipke-Joost, Calus, Mario P. L., Génétique Animale et Biologie Intégrative (GABI), Institut National de la Recherche Agronomique (INRA)-AgroParisTech, Animal Breeding and Genomics, Wageningen University and Research [Wageningen] (WUR), Center for Genetic Resources, Centre for Genetic Resources, and AgroParisTech-Institut National de la Recherche Agronomique (INRA)
- Subjects
[SDV.GEN]Life Sciences [q-bio]/Genetics, [SDV.GEN.GA]Life Sciences [q-bio]/Genetics/Animal genetics, [SDV]Life Sciences [q-bio], conservation, genetic, genome, ComputingMilieux_MISCELLANEOUS
- Abstract
International audience
- Published
- 2015
104. Genomic prediction for crossbred performance using metafounders.
- Author
-
Grevenhof, Elizabeth M van, Vandenplas, Jérémie, and Calus, Mario P L
- Subjects
CROSSBREEDING, ANIMAL breeding, SIMULATION methods & models, DEFINITIONS
- Abstract
Future genomic evaluation models to be used routinely in breeding programs for pigs and poultry need to be able to optimally use information of crossbred (CB) animals to predict breeding values for CB performance of purebred (PB) selection candidates. Important challenges in the commonly used single-step genomic best linear unbiased prediction (ssGBLUP) model are the definition of relationships between the different line compositions and the definition of the base generation per line. The use of metafounders (MFs) in ssGBLUP has been proposed to overcome these issues. When relationships between lines are known to be different from 0, the use of MFs generalizes the concept of genetic groups relying on the genotype data. Our objective was to investigate the effect of using MFs in genomic prediction for CB performance on estimated variance components, and accuracy and bias of GEBV. This was studied using stochastic simulation to generate data representing a three-way crossbreeding scheme in pigs, with the parental lines being either closely related or unrelated. Results show that, when using MFs, the variance components should be scaled appropriately, especially when basing them on estimates obtained with, for example, a pedigree-based model. The accuracies of GEBV that were obtained using MFs were similar to accuracies without using MFs, regardless of whether the lines involved in the CB were closely related or unrelated. The use of MFs resulted in a model that had similar or somewhat better convergence properties compared to other models. We recommend the use of MFs in ssGBLUP for genomic evaluations in crossbreeding schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
105. Genomic prediction of survival time in a population of brown laying hens showing cannibalistic behavior
- Author
-
Alemu, Setegn W., primary, Calus, Mario P. L., additional, Muir, William M., additional, Peeters, Katrijn, additional, Vereijken, Addie, additional, and Bijma, Piter, additional
- Published
- 2016
- Full Text
- View/download PDF
106. Assigning breed origin to alleles in crossbred animals
- Author
-
Vandenplas, Jérémie, primary, Calus, Mario P. L., additional, Sevillano, Claudia A., additional, Windig, Jack J., additional, and Bastiaansen, John W. M., additional
- Published
- 2016
- Full Text
- View/download PDF
107. Comparison of gene-based rare variant association mapping methods for quantitative traits in a bovine population with complex familial relationships
- Author
-
Zhang, Qianqian, primary, Guldbrandtsen, Bernt, additional, Calus, Mario P. L., additional, Lund, Mogens Sandø, additional, and Sahana, Goutam, additional
- Published
- 2016
- Full Text
- View/download PDF
108. Empirical determination of breed-of-origin of alleles in three-breed cross pigs
- Author
-
Sevillano, Claudia A., primary, Vandenplas, Jeremie, additional, Bastiaansen, John W. M., additional, and Calus, Mario P. L., additional
- Published
- 2016
- Full Text
- View/download PDF
109. Efficient genomic prediction based on whole-genome sequence data using split-and-merge Bayesian variable selection
- Author
-
Calus, Mario P. L., primary, Bouwman, Aniek C., additional, Schrooten, Chris, additional, and Veerkamp, Roel F., additional
- Published
- 2016
- Full Text
- View/download PDF
110. Whole-genome sequence data uncover loss of genetic diversity due to selection
- Author
-
Eynard, Sonia E., primary, Windig, Jack J., additional, Hiemstra, Sipke J., additional, and Calus, Mario P. L., additional
- Published
- 2016
- Full Text
- View/download PDF
111. An Equation to Predict the Accuracy of Genomic Values by Combining Data from Multiple Traits, Populations, or Environments
- Author
-
Wientjes, Yvonne C J, primary, Bijma, Piter, additional, Veerkamp, Roel F, additional, and Calus, Mario P L, additional
- Published
- 2015
- Full Text
- View/download PDF
112. Human-Mediated Introgression of Haplotypes in a Modern Dairy Cattle Breed.
- Author
-
Qianqian Zhang, Calus, Mario P. L., Bosse, Mirte, Sahana, Goutam, Lund, Mogens Sandø, and Guldbrandtsen, Bernt
- Subjects
*CATTLE, *FERTILITY, *GENOMES, *MILK, *NUCLEIC acid hybridization, *DNA-binding proteins, *HAPLOTYPES
- Abstract
Domestic animals can serve as model systems of adaptive introgression and their genomic signatures. In part, their usefulness as model systems is due to their well-known histories. Different breeding strategies such as introgression and artificial selection have generated numerous desirable phenotypes and superior performance in domestic animals. The modern Danish Red Dairy Cattle is studied as an example of an introgressed population. It originates from crossing the traditional Danish Red Dairy Cattle with the Holstein and Brown Swiss breeds, both known for high milk production. This crossing happened, among other things, due to changes in the production system, to raise milk production and overall performance. The genomes of modern Danish Red Dairy Cattle are heavily influenced by regions introgressed from the Holstein and Brown Swiss breeds and under subsequent selection in the admixed population. The introgressed proportion of the genome was found to be highly variable across the genome. Haplotypes introgressed from Holstein and Brown Swiss contained or overlapped known genes affecting milk production, as well as protein and fat content (CD14, ZNF215, BCL2L12, and THRSP for Holstein origin and ITPR2, BCAT1, LAP3, and MED28 for Brown Swiss origin). Genomic regions with high introgression signals also contained genes and enriched QTL associated with calving traits, body conformation, feed efficiency, carcass, and fertility traits. These introgressed signals with relative identity-by-descent scores larger than the median showing Holstein or Brown Swiss introgression are mostly significantly correlated with the corresponding test statistics from signatures of selection analyses in modern Danish Red Dairy Cattle. Meanwhile, the putative significant introgressed signals have a significant dependency with the putative significant signals from signatures of selection analyses.
Artificial selection has played an important role in the genomic footprints of introgression in the genome of modern Danish Red Dairy Cattle. Our study on a modern cattle breed contributes to an understanding of genomic consequences of selective introgression by demonstrating the extent to which adaptive effects contribute to shape the specific genomic consequences of introgression. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
113. Sparse single-step genomic BLUP in crossbreeding schemes.
- Author
-
Vandenplas, Jérémie, Calus, Mario P L, and ten Napel, Jan
- Subjects
*CROSSBREEDING, *GENOTYPES, *SWINE, *CHICKENS, *ANIMAL breeding, *EIGENVALUES
- Abstract
The algorithm for proven and young animals (APY) efficiently computes an approximated inverse of the genomic relationship matrix, by dividing genotyped animals in the so-called core and noncore animals. The APY leads to computationally feasible single-step genomic Best Linear Unbiased Prediction (ssGBLUP) with a large number of genotyped animals and was successfully applied to real single-breed or line datasets. This study aimed to assess the quality of genomic estimated breeding values (GEBV) when using the APY (GEBVAPY), in comparison to GEBV when using the directly inverted genomic relationship matrix (GEBVDIRECT), for situations based on crossbreeding schemes, including F1 and F2 crosses, such as the ones for pigs and chickens. Based on simulations of a 3-way crossbreeding program, we compared different approximated inverses of a genomic relationship matrix, by varying the size and the composition of the core group. We showed that GEBVAPY were accurate approximations of GEBVDIRECT for multivariate ssGBLUP involving different breeds and their crosses. GEBVAPY as accurate as GEBVDIRECT were obtained when the core groups included animals from different breed compositions and when the core groups had a size between the numbers of the largest eigenvalues explaining 98% and 99% of the variation in the raw genomic relationship matrix. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
114. Sparse single-step genomic BLUP in crossbreeding schemes.
- Author
-
Vandenplas, Jérémie, Calus, Mario P L, and ten Napel, Jan
- Subjects
CROSSBREEDING, GENOTYPES, SWINE, CHICKENS, ANIMAL breeding, EIGENVALUES
- Abstract
The algorithm for proven and young animals (APY) efficiently computes an approximated inverse of the genomic relationship matrix, by dividing genotyped animals in the so-called core and noncore animals. The APY leads to computationally feasible single-step genomic Best Linear Unbiased Prediction (ssGBLUP) with a large number of genotyped animals and was successfully applied to real single-breed or line datasets. This study aimed to assess the quality of genomic estimated breeding values (GEBV) when using the APY (GEBVAPY), in comparison to GEBV when using the directly inverted genomic relationship matrix (GEBVDIRECT), for situations based on crossbreeding schemes, including F1 and F2 crosses, such as the ones for pigs and chickens. Based on simulations of a 3-way crossbreeding program, we compared different approximated inverses of a genomic relationship matrix, by varying the size and the composition of the core group. We showed that GEBVAPY were accurate approximations of GEBVDIRECT for multivariate ssGBLUP involving different breeds and their crosses. GEBVAPY as accurate as GEBVDIRECT were obtained when the core groups included animals from different breed compositions and when the core groups had a size between the numbers of the largest eigenvalues explaining 98% and 99% of the variation in the raw genomic relationship matrix. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
115. Genomic prediction for crossbred performance using metafounders.
- Author
-
van Grevenhof, Elizabeth M, Vandenplas, Jérémie, and Calus, Mario P L
- Abstract
Future genomic evaluation models to be used routinely in breeding programs for pigs and poultry need to be able to optimally use information of crossbred (CB) animals to predict breeding values for CB performance of purebred (PB) selection candidates. Important challenges in the commonly used single-step genomic best linear unbiased prediction (ssGBLUP) model are the definition of relationships between the different line compositions and the definition of the base generation per line. The use of metafounders (MFs) in ssGBLUP has been proposed to overcome these issues. When relationships between lines are known to be different from 0, the use of MFs generalizes the concept of genetic groups relying on the genotype data. Our objective was to investigate the effect of using MFs in genomic prediction for CB performance on estimated variance components, and accuracy and bias of GEBV. This was studied using stochastic simulation to generate data representing a three-way crossbreeding scheme in pigs, with the parental lines being either closely related or unrelated. Results show that, when using MFs, the variance components should be scaled appropriately, especially when basing them on estimates obtained with, for example, a pedigree-based model. The accuracies of GEBV that were obtained using MFs were similar to accuracies without using MFs, regardless of whether the lines involved in the CB were closely related or unrelated. The use of MFs resulted in a model that had similar or somewhat better convergence properties compared to other models. We recommend the use of MFs in ssGBLUP for genomic evaluations in crossbreeding schemes.
- Published
- 2019
- Full Text
- View/download PDF
116. Genomic prediction using imputed whole-genome sequence data in Holstein Friesian cattle
- Author
-
van Binsbergen, Rianne, primary, Calus, Mario P. L., additional, Bink, Marco C. A. M., additional, van Eeuwijk, Fred A., additional, Schrooten, Chris, additional, and Veerkamp, Roel F., additional
- Published
- 2015
- Full Text
- View/download PDF
117. Accuracy of imputation using the most common sires as reference population in layer chickens
- Author
-
Heidaritabar, Marzieh, primary, Calus, Mario P. L., additional, Vereijken, Addie, additional, Groenen, Martien A. M., additional, and Bastiaansen, John W. M., additional
- Published
- 2015
- Full Text
- View/download PDF
118. Contribution of rare and low-frequency whole-genome sequence variants to complex traits variation in dairy cattle.
- Author
-
Qianqian Zhang, Calus, Mario P. L., Guldbrandtsen, Bernt, Lund, Mogens Sandø, and Sahana, Goutam
- Subjects
GENOMES, CATTLE genetics, HOLSTEIN-Friesian cattle, ANALYSIS of variance, GENOMICS
- Abstract
Background: Whole-genome sequencing and imputation methodologies have enabled the study of the effects of genomic variants with low to very low minor allele frequency (MAF) on variation in complex traits. Our objective was to estimate the proportion of variance explained by imputed sequence variants classified according to their MAF compared with the variance explained by the pedigree-based additive genetic relationship matrix for 17 traits in Nordic Holstein dairy cattle. Results: Imputed sequence variants were grouped into seven classes according to their MAF (0.001-0.01, 0.01-0.05, 0.05-0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4 and 0.4-0.5). The total contribution of all imputed sequence variants to variance in deregressed estimated breeding values or proofs (DRP) for different traits ranged from 0.41 [standard error (SE) = 0.026] for temperament to 0.87 (SE = 0.011) for milk yield. The contribution of rare variants (MAF < 0.01) to the total DRP variance explained by all imputed sequence variants was relatively small (a maximum of 12.5% for the health index). Rare and low-frequency variants (MAF < 0.05) contributed a larger proportion of the explained DRP variances (>13%) for health-related traits than for production traits (<11%). However, a substantial proportion of these variance estimates across different MAF classes had large SE, especially when the variance explained by a MAF class was small. The proportion of DRP variance that was explained by all imputed whole-genome sequence variants improved slightly compared with variance explained by the 50 k Illumina markers, which are routinely used in bovine genomic prediction. However, the proportion of DRP variance explained by imputed sequence variants was lower than that explained by pedigree relationships, ranging from 1.5% for milk yield to 37.9% for the health index. 
Conclusions: Imputed sequence variants explained more of the variance in DRP than the 50 k markers for most traits, but explained less variance than that captured by pedigree-based relationships. Although in humans partitioning variants into groups based on MAF and linkage disequilibrium was used to estimate heritability without bias, many of our bovine estimates had a high SE. For a reliable estimate of the explained DRP variance for different MAF classes, larger sample sizes are needed. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
119. The Importance of Endophenotypes to Evaluate the Relationship between Genotype and External Phenotype.
- Author
-
te Pas, Marinus F. W., Madsen, Ole, Calus, Mario P. L., and Smits, Mari A.
- Subjects
PROTEOMICS, METABOLOMICS, BIOINFORMATICS, SYSTEMS biology, MENDEL'S law, ANIMAL genetics
- Abstract
With the exception of a few Mendelian traits, almost all phenotypes (traits) in livestock science are quantitative or complex traits regulated by the expression of many genes. For most of the complex traits, differential expression of genes, rather than genomic variation in the gene coding sequences, is associated with the genotype of a trait. The expression profiles of the animal's transcriptome, proteome and metabolome represent endophenotypes that influence/regulate the externally-observed phenotype. These expression profiles are generated by interactions between the animal's genome and its environment that range from the cellular, up to the husbandry environment. Thus, understanding complex traits requires knowledge about not only genomic variation, but also environmental effects that affect genome expression. Gene products act together in physiological pathways and interaction networks (of pathways). Due to the lack of annotation of the functional genome and ontologies of genes, our knowledge about the various biological systems that contribute to the development of external phenotypes is sparse. Furthermore, interaction with the animals' microbiome, especially in the gut, greatly influences the external phenotype. We conclude that a detailed understanding of complex traits requires not only understanding of variation in the genome, but also its expression at all functional levels. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
120. Plasma Proteome Profiles Associated with Diet-Induced Metabolic Syndrome and the Early Onset of Metabolic Syndrome in a Pig Model
- Author
-
te Pas, Marinus F. W., primary, Koopmans, Sietse-Jan, additional, Kruijt, Leo, additional, Calus, Mario P. L., additional, and Smits, Mari A., additional
- Published
- 2013
- Full Text
- View/download PDF
121. Genomic Prediction in Animals and Plants: Simulation of Data, Validation, Reporting, and Benchmarking
- Author
-
Daetwyler, Hans D, primary, Calus, Mario P L, additional, Pong-Wong, Ricardo, additional, de los Campos, Gustavo, additional, and Hickey, John M, additional
- Published
- 2013
- Full Text
- View/download PDF
122. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding
- Author
-
de los Campos, Gustavo, primary, Hickey, John M, additional, Pong-Wong, Ricardo, additional, Daetwyler, Hans D, additional, and Calus, Mario P L, additional
- Published
- 2013
- Full Text
- View/download PDF
123. The Effect of Linkage Disequilibrium and Family Relationships on the Reliability of Genomic Prediction
- Author
-
Wientjes, Yvonne C J, primary, Veerkamp, Roel F, additional, and Calus, Mario P L, additional
- Published
- 2013
- Full Text
- View/download PDF
124. Comparison of gene-based rare variant association mapping methods for quantitative traits in a bovine population with complex familial relationships.
- Author
-
Qianqian Zhang, Guldbrandtsen, Bernt, Calus, Mario P. L., Lund, Mogens Sandø, and Sahana, Goutam
- Subjects
CATTLE genetics, GENE expression, CATTLE genome mapping, GENETIC polymorphisms, CATTLE population genetics, CATTLE
- Abstract
Background: There is growing interest in the role of rare variants in the variation of complex traits due to increasing evidence that rare variants are associated with quantitative traits. However, association methods that are commonly used for mapping common variants are not effective to map rare variants. Besides, livestock populations have large half-sib families and the occurrence of rare variants may be confounded with family structure, which makes it difficult to disentangle their effects from family mean effects. We compared the power of methods that are commonly applied in human genetics to map rare variants in cattle using whole-genome sequence data and simulated phenotypes. We also studied the power of mapping rare variants using linear mixed models (LMM), which are the method of choice to account for both family relationships and population structure in cattle. Results: We observed that the power of the LMM approach was low for mapping a rare variant (defined as those that have frequencies lower than 0.01) with a moderate effect (5 to 8 % of phenotypic variance explained by multiple rare variants that vary from 5 to 21 in number) contributing to a QTL with a sample size of 1000. In contrast, across the scenarios studied, statistical methods that are specialized for mapping rare variants increased power regardless of whether multiple rare variants or a single rare variant underlie a QTL. Different methods for combining rare variants in the test single nucleotide polymorphism set resulted in similar power irrespective of the proportion of total genetic variance explained by the QTL. However, when the QTL variance is very small (only 0.1 % of the total genetic variance), these specialized methods for mapping rare variants and LMM generally had no power to map the variants within a gene with sample sizes of 1000 or 5000. 
Conclusions: We observed that the methods that combine multiple rare variants within a gene into a meta-variant generally had greater power to map rare variants compared to LMM. Therefore, it is recommended to use rare variant association mapping methods to map rare genetic variants that affect quantitative traits in livestock, such as bovine populations. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
125. Partial least square regression applied to the QTLMAS 2010 dataset
- Author
-
Coster, Albart, primary and Calus, Mario P L, additional
- Published
- 2011
- Full Text
- View/download PDF
126. QTLMAS 2009: simulated dataset
- Author
-
Coster, Albart, primary, Bastiaansen, John W M, additional, Calus, Mario P L, additional, Maliepaard, Chris, additional, and Bink, Marco C A M, additional
- Published
- 2010
- Full Text
- View/download PDF
127. Comparison of analyses of the QTLMAS XIII common dataset. II: QTL analysis
- Author
-
Maliepaard, Chris, primary, Bastiaansen, John W M, additional, Calus, Mario P L, additional, Coster, Albart, additional, and Bink, Marco C A M, additional
- Published
- 2010
- Full Text
- View/download PDF
128. Comparison of analyses of the QTLMAS XIII common dataset. I: genomic selection
- Author
-
Bastiaansen, John W M, primary, Bink, Marco C A M, additional, Coster, Albart, additional, Maliepaard, Chris, additional, and Calus, Mario P L, additional
- Published
- 2010
- Full Text
- View/download PDF
129. Simultaneous QTL detection and genomic breeding value estimation using high density SNP chips
- Author
-
Veerkamp, Roel F, primary, Verbyla, Klara L, additional, Mulder, Han A, additional, and Calus, Mario P L, additional
- Published
- 2010
- Full Text
- View/download PDF
130. Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds.
- Author
-
Qianqian Zhang, Calus, Mario P. L., Guldbrandtsen, Bernt, Lund, Mogens S., and Sahana, Goutam
- Subjects
*CATTLE breeds, *CATTLE breeding, *CATTLE pedigrees, *SINGLE nucleotide polymorphisms, *GENOTYPES, *HOMOZYGOSITY, *GENETICS
- Abstract
Background: Levels of inbreeding in cattle populations have increased in the past due to the use of a limited number of bulls for artificial insemination. High levels of inbreeding lead to reduced genetic diversity and inbreeding depression. Various estimators based on different sources, e.g., pedigree or genomic data, have been used to estimate inbreeding coefficients in cattle populations. However, the comparative advantage of using full sequence data to assess inbreeding is unknown. We used pedigree and genomic data at different densities from 50k to full sequence variants to compare how different methods performed for the estimation of inbreeding levels in three different cattle breeds. Results: Five different estimates for inbreeding were calculated and compared in this study: pedigree-based inbreeding coefficient (FPED); run of homozygosity (ROH)-based inbreeding coefficients (FROH); genomic relationship matrix (GRM)-based inbreeding coefficients (FGRM); inbreeding coefficients based on excess of homozygosity (FHOM) and correlation of uniting gametes (FUNI). Estimates using ROH provided direct estimates of the levels of autozygosity in the current populations and are free from the effects of allele frequencies and incomplete pedigrees, which may increase inaccuracy in the estimation of inbreeding. The highest correlations were observed between FROH estimated from the full sequence variants and the FROH estimated from 50k SNP (single nucleotide polymorphism) genotypes. The estimator based on the correlation between uniting gametes (FUNI) using full genome sequences was also strongly correlated with FROH detected from sequence data. Conclusions: Estimates based on ROH directly reflected levels of homozygosity and were not influenced by allele frequencies, unlike the three other estimates evaluated (FGRM, FHOM and FUNI), which depended on estimated allele frequencies. FPED suffered from limited pedigree depth. Marker density affects ROH estimation.
Detecting ROH based on 50k chip data was observed to give estimates similar to ROH from sequence data. In the absence of full sequence data, ROH based on 50k data can be used to assess homozygosity levels in individuals. However, genotypes denser than 50k are required to accurately detect short ROH that are most likely identical by descent (IBD). [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
131. Using selection index theory to estimate consistency of multi-locus linkage disequilibrium across populations.
- Author
-
Wientjes, Yvonne C. J., Veerkamp, Roel F., and Calus, Mario P. L.
- Subjects
SINGLE nucleotide polymorphisms, GENETIC research, GENOTYPES, LINKAGE disequilibrium, SELECTION indexes (Animal breeding)
- Abstract
Background: The potential of combining multiple populations in genomic prediction depends on the consistency of linkage disequilibrium (LD) between SNPs and QTL across populations. We investigated consistency of multi-locus LD across populations using selection index theory and investigated the relationship between consistency of multi-locus LD and accuracy of genomic prediction across different simulated scenarios. In the selection index, QTL genotypes were considered as breeding goal traits and SNP genotypes as index traits, based on LD among SNPs and between SNPs and QTL. The consistency of multi-locus LD across populations was computed as the accuracy of predicting QTL genotypes in selection candidates using a selection index derived in the reference population. Different scenarios of within and across population genomic prediction were evaluated, using all SNPs or only the four neighboring SNPs of a simulated QTL. Phenotypes were simulated using different numbers of QTL underlying the trait. The relationship between the calculated consistency of multi-locus LD and accuracy of genomic prediction using a GBLUP type of model was investigated. Results: The accuracy of predicting QTL genotypes, i.e. the measure describing consistency of multi-locus LD, was much lower for across population scenarios compared to within population scenarios, and was lower when QTL had a low MAF compared to QTL randomly selected from the SNPs. Consistency of multi-locus LD was highly correlated with the realized accuracy of genomic prediction across different scenarios and the correlation was higher when QTL were weighted according to their effects in the selection index instead of weighting QTL equally. By only considering neighboring SNPs of QTL, accuracy of predicting QTL genotypes within population decreased, but it substantially increased the accuracy across populations.
Conclusions: Consistency of multi-locus LD across populations is a characteristic of the properties of the QTL in the investigated populations and can provide more insight into the underlying reasons for a low empirical accuracy of across population genomic prediction. By focusing in genomic prediction models only on neighboring SNPs of QTL, multi-locus LD is more consistent across populations since only short-range LD is considered, and accuracy of predicting QTL genotypes of individuals from another population is increased. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
132. Impact of QTL properties on the accuracy of multi-breed genomic prediction.
- Author
-
Wientjes, Yvonne C. J., Calus, Mario P. L., Goddard, Michael E., and Hayes, Ben J.
- Subjects
GENOMES, GENETIC algorithms, FORECASTING, GENETICS, ANIMAL genetics
- Abstract
Background: Although simulation studies show that combining multiple breeds in one reference population increases accuracy of genomic prediction, this is not always confirmed in empirical studies. This discrepancy might be due to the assumptions on quantitative trait loci (QTL) properties applied in simulation studies, including number of QTL, spectrum of QTL allele frequencies across breeds, and distribution of allele substitution effects. We investigated the effects of QTL properties and of including a random across- and within-breed animal effect in a genomic best linear unbiased prediction (GBLUP) model on accuracy of multi-breed genomic prediction using genotypes of Holstein-Friesian and Jersey cows. Methods: Genotypes of three classes of variants obtained from whole-genome sequence data, with moderately low, very low or extremely low average minor allele frequencies (MAF), were imputed in 3000 Holstein-Friesian and 3000 Jersey cows that had real high-density genotypes. Phenotypes of traits controlled by QTL with different properties were simulated by sampling 100 or 1000 QTL from one class of variants and their allele substitution effects either randomly from a gamma distribution, or computed such that each QTL explained the same variance, i.e. rare alleles had a large effect. Genomic breeding values for 1000 selection candidates per breed were estimated using GBLUP models including a random across- and a within-breed animal effect. Results: For all three classes of QTL allele frequency spectra, accuracies of genomic prediction were not affected by the addition of 2000 individuals of the other breed to a reference population of the same breed as the selection candidates. Accuracies of both single- and multi-breed genomic prediction decreased as MAF of QTL decreased, especially when rare alleles had a large effect. 
Accuracies of genomic prediction were similar for the models with and without a random within-breed animal effect, probably because of insufficient power to separate across- and within-breed animal effects. Conclusions: Accuracy of both single- and multi-breed genomic prediction depends on the properties of the QTL that underlie the trait. As QTL MAF decreased, accuracy decreased, especially when rare alleles had a large effect. This demonstrates that QTL properties are key parameters that determine the accuracy of genomic prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
133. The effect of rare alleles on estimated genomic relationships from whole genome sequence data.
- Author
-
Eynard, Sonia E., Windig, Jack J., Leroy, Grégoire, van Binsbergen, Rianne, and Calus, Mario P. L.
- Subjects
HOLSTEIN-Friesian cattle ,SINGLE nucleotide polymorphisms ,CATTLE pedigrees ,ALLELES ,DAIRY cattle breeds ,REPRODUCTION - Abstract
Background: Relationships between individuals and inbreeding coefficients are commonly used for breeding decisions, but may be affected by the type of data used for their estimation. The proportion of variants with low Minor Allele Frequency (MAF) is larger in whole genome sequence (WGS) data compared to Single Nucleotide Polymorphism (SNP) chips. Therefore, WGS data provide true relationships between individuals and may influence breeding decisions and prioritisation for conservation of genetic diversity in livestock. This study identifies differences between relationships and inbreeding coefficients estimated using pedigree, SNP or WGS data for 118 Holstein bulls from the 1000 Bull genomes project. To determine the impact of rare alleles on the estimates we compared three scenarios of MAF restrictions: variants with a MAF higher than 5%, variants with a MAF higher than 1% and variants with a MAF between 1% and 5%. Results: We observed significant differences between estimated relationships and, although less significantly, inbreeding coefficients from pedigree, SNP or WGS data, and between MAF restriction scenarios. Computed correlations between pedigree and genomic relationships, within groups with similar relationships, ranged from negative to moderate for both estimated relationships and inbreeding coefficients, but were high between estimates from SNP and WGS (0.49 to 0.99). Estimated relationships from genomic information exhibited higher variation than from pedigree. Inbreeding coefficients analysis showed that more complete pedigree records lead to higher correlation between inbreeding coefficients from pedigree and genomic data. Finally, estimates and correlations between additive genetic (A) and genomic (G) relationship matrices were lower, and variances of the relationships were larger when accounting for allele frequencies than without accounting for allele frequencies. 
Conclusions: Using pedigree data or genomic information, and including or excluding variants with a MAF below 5% showed significant differences in relationship and inbreeding coefficient estimates. Estimated relationships and inbreeding coefficients are the basis for selection decisions. Therefore, it can be expected that using WGS instead of SNP can affect selection decision. Inclusion of rare variants will give access to the variation they carry, which is of interest for conservation of genetic diversity. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
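The abstract of record 133 compares relationships estimated under different MAF restrictions, with and without accounting for allele frequencies. The sketch below shows one common way to build a frequency-centred genomic relationship matrix from genotypes restricted to a MAF window. It assumes the VanRaden-style estimator G = ZZ' / (2 Σ p(1 − p)), which the abstract does not name; the function names, the 0/1/2 genotype coding and the toy data are hypothetical.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP; genotypes are rows of 0/1/2 codes."""
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_snps)]


def genomic_relationships(genotypes, maf_min=0.0, maf_max=0.5):
    """Relationship matrix from SNPs with maf_min < MAF <= maf_max.

    Assumed estimator (not stated in the abstract): centre each SNP by twice
    its allele frequency and scale by the sum of 2p(1 - p) over retained SNPs.
    """
    freqs = allele_frequencies(genotypes)
    keep = [j for j, p in enumerate(freqs) if maf_min < min(p, 1 - p) <= maf_max]
    if not keep:
        raise ValueError("no SNP falls inside the requested MAF window")
    scale = sum(2 * freqs[j] * (1 - freqs[j]) for j in keep)
    centred = [[row[j] - 2 * freqs[j] for j in keep] for row in genotypes]
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Toy data: 4 animals x 3 SNPs (hypothetical, not from the study).
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G = genomic_relationships(toy_genotypes, maf_min=0.05)
```

Calling the function with `maf_min=0.05`, with `maf_min=0.01`, or with `maf_min=0.01, maf_max=0.05` mirrors the three MAF scenarios listed in the abstract.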
134. Empirical and deterministic accuracies of across-population genomic prediction.
- Author
-
Wientjes, Yvonne C. J., Veerkamp, Roel F., Bijma, Piter, Bovenhuis, Henk, Schrooten, Chris, and Calus, Mario P. L.
- Subjects
CATTLE genetics ,CATTLE breeding ,ALLELES ,COWS ,GENOMICS - Abstract
Background: Differences in linkage disequilibrium and in allele substitution effects of QTL (quantitative trait loci) may hinder genomic prediction across populations. Our objective was to develop a deterministic formula to estimate the accuracy of across-population genomic prediction, for which reference individuals and selection candidates are from different populations, and to investigate the impact of differences in allele substitution effects across populations and of the number of QTL underlying a trait on the accuracy. Methods: A deterministic formula to estimate the accuracy of across-population genomic prediction was derived based on selection index theory. Moreover, accuracies were deterministically predicted using a formula based on population parameters and empirically calculated using simulated phenotypes and a GBLUP (genomic best linear unbiased prediction) model. Phenotypes of 1033 Holstein-Friesian, 105 Groninger White Headed and 147 Meuse-Rhine-Yssel cows were simulated by sampling 3000, 300, 30 or 3 QTL from the available high-density SNP (single nucleotide polymorphism) information of three chromosomes, assuming a correlation of 1.0, 0.8, 0.6, 0.4, or 0.2 between allele substitution effects across breeds. The simulated heritability was set to 0.95 to resemble the heritability of deregressed proofs of bulls. Results: Accuracies estimated with the deterministic formula based on selection index theory were similar to empirical accuracies for all scenarios, while accuracies predicted with the formula based on population parameters overestimated empirical accuracies by ∼25 to 30%. When the between-breed genetic correlation differed from 1, i.e. allele substitution effects differed across breeds, empirical and deterministic accuracies decreased in proportion to the genetic correlation. Using a multi-trait model, it was possible to accurately estimate the genetic correlation between the breeds based on phenotypes and high-density genotypes. 
The number of QTL underlying the simulated trait did not affect the accuracy. Conclusions: The deterministic formula based on selection index theory estimated the accuracy of across-population genomic predictions well. The deterministic formula using population parameters overestimated the across-population genomic accuracy, but may still be useful because of its simplicity. Both formulas could accommodate for genetic correlations between populations lower than 1. The number of QTL underlying a trait did not affect the accuracy of across-population genomic prediction using a GBLUP method. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
135. Genomic prediction based on data from three layer lines using non-linear regression models.
- Author
-
Heyun Huang, Windig, Jack J., Vereijken, Addie, and Calus, Mario P. L.
- Subjects
GENOMICS ,NONLINEAR regression ,LINEAR statistical models ,PHENOTYPES ,DATA distribution ,STATISTICAL correlation - Abstract
Background: Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. Methods: In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. Results: When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Conclusions: Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. 
This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
136. A comparison of principal component regression and genomic REML for genomic prediction across populations.
- Author
-
Dadousis, Christos, Veerkamp, Roel F., Heringstad, Bjørg, Pszczola, Marcin, and Calus, Mario P. L.
- Subjects
GENOMICS ,ANIMAL population density ,SINGLE nucleotide polymorphisms ,MULTICOLLINEARITY ,MULTIVARIATE analysis ,PRINCIPAL components analysis ,POLYMERASE chain reaction - Abstract
Background: Genomic prediction faces two main statistical problems: multicollinearity and n⪡p (many fewer observations than predictor variables). Principal component (PC) analysis is a multivariate statistical method that is often used to address these problems. The objective of this study was to compare the performance of PC regression (PCR) for genomic prediction with that of a commonly used REML model with a genomic relationship matrix (GREML) and to investigate the full potential of PCR for genomic prediction. Methods: The PCR model used either a common or a semi-supervised approach, where PC were selected based either on their eigenvalues (i.e. proportion of variance explained by SNP (single nucleotide polymorphism) genotypes) or on their association with phenotypic variance in the reference population (i.e. the regression sum of squares contribution). Cross-validation within the reference population was used to select the optimum PCR model that minimizes mean squared error. Pre-corrected average daily milk, fat and protein yields of 1609 first lactation Holstein heifers, from Ireland, UK, the Netherlands and Sweden, which were genotyped with 50 k SNPs, were analysed. Each testing subset included animals from only one country, or from only one selection line for the UK. Results: In general, accuracies of GREML and PCR were similar but GREML slightly outperformed PCR. Inclusion of genotyping information of validation animals into model training (semi-supervised PCR), did not result in more accurate genomic predictions. The highest achievable PCR accuracies were obtained across a wide range of numbers of PC fitted in the regression (from one to more than 1000), across test populations and traits. Using cross-validation within the reference population to derive the number of PC, yielded substantially lower accuracies than the highest achievable accuracies obtained across all possible numbers of PC. 
Conclusions: On average, PCR performed only slightly less well than GREML. When the optimal number of PC was determined based on realized accuracy in the testing population, PCR showed a higher potential in terms of achievable accuracy that was not capitalized when PC selection was based on cross-validation. A standard approach for selecting the optimal set of PC in PCR remains a challenge. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
137. Genomic prediction based on data from three layer lines: a comparison between linear methods.
- Author
-
Calus, Mario P. L., Heyun Huang, Vereijken, Addie, Visscher, Jeroen, Napel, Jan ten, and Windig, Jack J.
- Subjects
GENOMICS ,LINEAR systems ,RIDGE regression (Statistics) ,PRINCIPAL components analysis ,HERITABILITY - Abstract
Background: The prediction accuracy of several linear genomic prediction models, which have previously been used for within-line genomic prediction, was evaluated for multi-line genomic prediction. Methods: Compared to a conventional BLUP (best linear unbiased prediction) model using pedigree data, we evaluated the following genomic prediction models: genome-enabled BLUP (GBLUP), ridge regression BLUP (RRBLUP), principal component analysis followed by ridge regression (RRPCA), BayesC and Bayesian stochastic search variable selection. Prediction accuracy was measured as the correlation between predicted breeding values and observed phenotypes divided by the square root of the heritability. The data used concerned laying hens with phenotypes for number of eggs in the first production period and known genotypes. The hens were from two closely-related brown layer lines (B1 and B2), and a third distantly-related white layer line (W1). Lines had 1004 to 1023 training animals and 238 to 240 validation animals. Training datasets consisted of animals of either single lines, or a combination of two or all three lines, and had 30 508 to 45 974 segregating single nucleotide polymorphisms. Results: Genomic prediction models yielded 0.13 to 0.16 higher accuracies than pedigree-based BLUP. When excluding the line itself from the training dataset, genomic predictions were generally inaccurate. Use of multiple lines marginally improved prediction accuracy for B2 but did not affect or slightly decreased prediction accuracy for B1 and W1. Differences between models were generally small except for RRPCA which gave considerably higher accuracies for B2. Correlations between genomic predictions from different methods were higher than 0.96 for W1 and higher than 0.88 for B1 and B2. The greater differences between methods for B1 and B2 were probably due to the lower accuracy of predictions for B1 (∼0.45) and B2 (∼0.40) compared to W1 (∼0.76). 
Conclusions: Multi-line genomic prediction did not affect or slightly improved prediction accuracy for closely-related lines. For distantly-related lines, multi-line genomic prediction yielded similar or slightly lower accuracies than single-line genomic prediction. Bayesian variable selection and GBLUP generally gave similar accuracies. Overall, RRPCA yielded the greatest accuracies for two lines, suggesting that using PCA helps to alleviate the "n⪡p" problem in genomic prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
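Record 137 defines prediction accuracy as the correlation between predicted breeding values and observed phenotypes, divided by the square root of the heritability. A minimal sketch of that calculation follows; the function names and the example values are hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy(predicted_ebv, observed_phenotypes, heritability):
    """Correlation of predictions with phenotypes, divided by sqrt(h2)."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return pearson(predicted_ebv, observed_phenotypes) / math.sqrt(heritability)


# Hypothetical values for five validation animals.
ebv = [0.2, -0.1, 0.4, 0.0, -0.3]
phenotypes = [1.1, 0.3, 0.9, 0.6, -0.2]
accuracy = prediction_accuracy(ebv, phenotypes, heritability=0.4)
```

Dividing by the square root of the heritability rescales the correlation with phenotypes towards the correlation with true breeding values, which is why the result can exceed 1 in small validation sets.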
138. Genomic prediction of breeding values using previously estimated SNP variances.
- Author
-
Calus, Mario P. L., Schrooten, Chris, and Veerkamp, Roel F.
- Subjects
SINGLE nucleotide polymorphisms ,BREEDING ,GENOMICS ,MOLECULAR genetics ,BAYES' estimation - Abstract
Background: Genomic prediction requires estimation of variances of effects of single nucleotide polymorphisms (SNPs), which is computationally demanding, and uses these variances for prediction. We have developed models with separate estimation of SNP variances, which can be applied infrequently, and genomic prediction, which can be applied routinely. Methods: SNP variances were estimated with Bayes Stochastic Search Variable Selection (BSSVS) and BayesC. Genome-enhanced breeding values (GEBV) were estimated with RR-BLUP (ridge regression best linear unbiased prediction), using either variances obtained from BSSVS (BLUP-SSVS) or BayesC (BLUP-C), or assuming equal variances for each SNP. Datasets used to estimate SNP variances comprised (1) all animals, (2) 50% random animals (RAN50), (3) 50% best animals (TOP50), or (4) 50% worst animals (BOT50). Traits analysed were protein yield, udder depth, somatic cell score, interval between first and last insemination, direct longevity, and longevity including information from predictors. Results: BLUP-SSVS and BLUP-C yielded similar GEBV as the equivalent Bayesian models that simultaneously estimated SNP variances. Reliabilities of these GEBV were consistently higher than from RR-BLUP, although only significantly for direct longevity. Across scenarios that used data subsets to estimate GEBV, observed reliabilities were generally higher for TOP50 than for RAN50, and much higher than for BOT50. Reliabilities of TOP50 were higher because the training data contained more ancestors of selection candidates. Using estimated SNP variances based on random or non-random subsets of the data, while using all data to estimate GEBV, did not affect reliabilities of the BLUP models. 
A convergence criterion of 10⁻⁸ instead of 10⁻¹⁰ for BLUP models yielded similar GEBV, while the required number of iterations decreased by 71 to 90%. Including a separate polygenic effect consistently improved reliabilities of the GEBV, but also substantially increased the required number of iterations to reach convergence with RR-BLUP. SNP variances converged faster for BayesC than for BSSVS. Conclusions: Combining Bayesian variable selection models to re-estimate SNP variances and BLUP models that use those SNP variances, yields GEBV that are similar to those from full Bayesian models. Moreover, these combined models yield predictions with higher reliability and less bias than the commonly used RR-BLUP model. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
139. Accuracy of imputation to whole-genome sequence data in Holstein Friesian cattle.
- Author
-
van Binsbergen, Rianne, Bink, Marco C. A. M., Calus, Mario P. L., van Eeuwijk, Fred A., Hayes, Ben J., Hulsegge, Ina, and Veerkamp, Roel F.
- Subjects
GENOMES ,NUCLEOTIDE sequence ,HOLSTEIN-Friesian cattle ,ANIMAL genetics ,CHROMOSOMES - Abstract
Background: The use of whole-genome sequence data can lead to higher accuracy in genome-wide association studies and genomic predictions. However, to benefit from whole-genome sequence data, a large dataset of sequenced individuals is needed. Imputation from SNP panels, such as the Illumina BovineSNP50 BeadChip and Illumina BovineHD BeadChip, to whole-genome sequence data is an attractive and less expensive approach to obtain whole-genome sequence genotypes for a large number of individuals than sequencing all individuals. Our objective was to investigate accuracy of imputation from lower density SNP panels to whole-genome sequence data in a typical dataset for cattle. Methods: Whole-genome sequence data of chromosome 1 (1 737 471 SNPs) for 114 Holstein Friesian bulls were used. Beagle software was used for imputation from the BovineSNP50 (3132 SNPs) and BovineHD (40 492 SNPs) beadchips. Accuracy was calculated as the correlation between observed and imputed genotypes and assessed by five-fold cross-validation. Three scenarios S40, S60 and S80 with respectively 40%, 60%, and 80% of the individuals as reference individuals were investigated. Results: Mean accuracies of imputation per SNP from the BovineHD panel to sequence data and from the BovineSNP50 panel to sequence data for scenarios S40 and S80 ranged from 0.77 to 0.83 and from 0.37 to 0.46, respectively. Stepwise imputation from the BovineSNP50 to BovineHD panel and then to sequence data for scenario S40 improved accuracy per SNP to 0.65 but it varied considerably between SNPs. Conclusions: Accuracy of imputation to whole-genome sequence data was generally high for imputation from the BovineHD beadchip, but was low from the BovineSNP50 beadchip. Stepwise imputation from the BovineSNP50 to the BovineHD beadchip and then to sequence data substantially improved accuracy of imputation. SNPs with a low minor allele frequency were more difficult to impute correctly and the reliability of imputation varied more. 
Linkage disequilibrium between an imputed SNP and the SNP on the lower density panel, minor allele frequency of the imputed SNP and size of the reference group affected imputation reliability. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
140. Right-hand-side updating for fast computing of genomic breeding values.
- Author
-
Calus, Mario P. L.
- Subjects
SINGLE nucleotide polymorphisms ,GENOMES ,ALGORITHMS ,PREDICTION models ,REGRESSION analysis - Abstract
Background: Since both the number of SNPs (single nucleotide polymorphisms) used in genomic prediction and the number of individuals used in training datasets are rapidly increasing, there is an increasing need to improve the efficiency of genomic prediction models in terms of computing time and memory (RAM) required. Methods: In this paper, two alternative algorithms for genomic prediction are presented that replace the originally suggested residual updating algorithm, without affecting the estimates. The first alternative algorithm continues to use residual updating, but takes advantage of the characteristic that the predictor variables in the model (i.e. the SNP genotypes) take only three different values, and is therefore termed "improved residual updating". The second alternative algorithm, here termed "right-hand-side updating" (RHS-updating), extends the idea of improved residual updating across multiple SNPs. The alternative algorithms can be implemented for a range of different genomic predictions models, including random regression BLUP (best linear unbiased prediction) and most Bayesian genomic prediction models. To test the required computing time and RAM, both alternative algorithms were implemented in a Bayesian stochastic search variable selection model. Results: Compared to the original algorithm, the improved residual updating algorithm reduced CPU time by 35.3 to 43.3%, without changing memory requirements. The RHS-updating algorithm reduced CPU time by 74.5 to 93.0% and memory requirements by 13.1 to 66.4% compared to the original algorithm. Conclusions: The presented RHS-updating algorithm provides an interesting alternative to reduce both computing time and memory requirements for a range of genomic prediction models. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
141. Imputation of non-genotyped individuals based on genotyped relatives: assessing the imputation accuracy of a real case scenario in dairy cattle.
- Author
-
Bouwman, Aniek C., Hickey, John M., Calus, Mario P. L., and Veerkamp, Roel F.
- Subjects
DAIRY cattle genetics ,LIVESTOCK genetics ,SINGLE nucleotide polymorphisms ,LINKAGE disequilibrium ,HAPLOTYPES - Abstract
Background: Imputation of genotypes for ungenotyped individuals could enable the use of valuable phenotypes created before the genomic era in analyses that require genotypes. The objective of this study was to investigate the accuracy of imputation of non-genotyped individuals using genotype information from relatives. Methods: Genotypes were simulated for all individuals in the pedigree of a real (historical) dataset of phenotyped dairy cows and with part of the pedigree genotyped. The software AlphaImpute was used for imputation in its standard settings but also without phasing, i.e. using basic inheritance rules and segregation analysis only. Different scenarios were evaluated i.e.: (1) the real data scenario, (2) addition of genotypes of sires and maternal grandsires of the ungenotyped individuals, and (3) addition of one, two, or four genotyped offspring of the ungenotyped individuals to the reference population. Results: The imputation accuracy using AlphaImpute in its standard settings was lower than without phasing. Including genotypes of sires and maternal grandsires in the reference population improved imputation accuracy, i.e. the correlation of the true genotypes with the imputed genotype dosages, corrected for mean gene content, across all animals increased from 0.47 (real situation) to 0.60. Including one, two and four genotyped offspring increased the accuracy of imputation across all animals from 0.57 (no offspring) to 0.73, 0.82, and 0.92, respectively. Conclusions: At present, the use of basic inheritance rules and segregation analysis appears to be the best imputation method for ungenotyped individuals. Comparison of our empirical animal-specific imputation accuracies to predictions based on selection index theory suggested that not correcting for mean gene content considerably overestimates the true accuracy. 
Imputation of ungenotyped individuals can help to include valuable phenotypes for genome-wide association studies or for genomic prediction, especially when the ungenotyped individuals have genotyped offspring. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
142. Long-term response to genomic selection: effects of estimation method and reference population structure for different genetic architectures.
- Author
-
Bastiaansen, John W. M., Coster, Albart, Calus, Mario P. L., van Arendonk, Johan A. M., and Bovenhuis, Henk
- Subjects
GENOMICS ,GENETIC markers ,BAYESIAN analysis ,REGRESSION analysis ,DAIRY cattle breeding - Abstract
Background: Genomic selection has become an important tool in the genetic improvement of animals and plants. The objective of this study was to investigate the impacts of breeding value estimation method, reference population structure, and trait genetic architecture, on long-term response to genomic selection without updating marker effects. Methods: Three methods were used to estimate genomic breeding values: a BLUP method with relationships estimated from genome-wide markers (GBLUP), a Bayesian method, and a partial least squares regression method (PLSR). A shallow (individuals from one generation) or deep reference population (individuals from five generations) was used with each method. The effects of the different selection approaches were compared under four different genetic architectures for the trait under selection. Selection was based on one of the three genomic breeding values, on pedigree BLUP breeding values, or performed at random. Selection continued for ten generations. Results: Differences in long-term selection response were small. For a genetic architecture with a very small number of three to four quantitative trait loci (QTL), the Bayesian method achieved a response that was 0.05 to 0.1 genetic standard deviation higher than other methods in generation 10. For genetic architectures with approximately 30 to 300 QTL, PLSR (shallow reference) or GBLUP (deep reference) had an average advantage of 0.2 genetic standard deviation over the Bayesian method in generation 10. GBLUP resulted in 0.6% and 0.9% less inbreeding than PLSR and the Bayesian method, respectively, and on average a one-third smaller reduction of genetic variance. Responses in early generations were greater with the shallow reference population while long-term response was not affected by reference population structure. Conclusions: The ranking of estimation methods was different with than without selection. 
Under selection, applying GBLUP led to lower inbreeding and a smaller reduction of genetic variance while a similar response to selection was achieved. The reference population structure had a limited effect on long-term accuracy and response. Use of a shallow reference population, most closely related to the selection candidates, gave early benefits while in later generations, when marker effects were not updated, the estimation of marker effects based on a deeper reference population did not pay off. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
143. Identification of Mendelian inconsistencies between SNP and pedigree information of sibs.
- Author
-
Calus, Mario P. L., Mulder, Han A., and Bastiaansen, John W. M.
- Subjects
ANIMAL genetics ,GENETIC polymorphisms ,ANIMAL genome mapping ,GENE expression ,GENETIC markers - Abstract
Background: Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Methods: Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Results: Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error), were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error), were considerably higher for SIBREL compared to SIBCOUNT. 
Conclusions: Tests to remove Mendelian inconsistencies between sibs should be preceded by a test for parent-offspring inconsistencies. This parent-offspring test should not only consider parent-offspring pairs based on pedigree data, but also those based on SNP information. Both SIB tests could identify pairs of sibs with Mendelian inconsistencies. Based on type I and II error rates, counting opposing homozygotes between sibs (SIBCOUNT) appears slightly more precise than comparing genomic and pedigree relationships (SIBREL) to detect Mendelian inconsistencies between sibs. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
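The parent-offspring and sib tests in record 143 both rest on counting SNPs at which two animals carry opposing homozygous genotypes. A minimal sketch of that count follows, assuming genotypes coded as 0, 1 or 2 copies of one allele with None for a missing call; the function name, the coding and the example genotypes are hypothetical, and the thresholds and iterative removal described in the abstract are not reproduced.

```python
def count_opposing_homozygotes(genotypes_a, genotypes_b):
    """Count SNPs where one animal is homozygous 0 and the other homozygous 2.

    Genotypes are coded 0/1/2; None marks a missing call and is skipped.
    """
    count = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        if {a, b} == {0, 2}:
            count += 1
    return count


# Hypothetical genotypes at six SNPs for a putative parent and offspring.
parent = [0, 2, 1, 2, 0, None]
offspring = [2, 2, 1, 0, 1, 0]
n_opposing = count_opposing_homozygotes(parent, offspring)
```

For a true parent-offspring pair, opposing homozygotes should arise only from genotyping errors or mutation, whereas true sibs can legitimately be opposing homozygotes when both parents are heterozygous. This is presumably why the sib tests compare the count against an expected distribution rather than against zero.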
144. Accuracy of multi-trait genomic selection using different methods.
- Author
-
Calus, Mario P. L. and Veerkamp, Roel F.
- Subjects
GENOTYPE-environment interaction ,ANIMAL genetics ,GENETIC polymorphisms ,CHROMOSOME polymorphism ,GENE expression - Abstract
Background: Genomic selection has become a very important tool in animal genetics and is rapidly emerging in plant genetics. It holds the promise to be particularly beneficial to select for traits that are difficult or expensive to measure, such as traits that are measured in one environment and selected for in another environment. The objective of this paper was to develop three models that would permit multi-trait genomic selection by combining scarcely recorded traits with genetically correlated indicator traits, and to compare their performance to single-trait models, using simulated datasets. Methods: Three single nucleotide polymorphism (SNP) based models were used. Model G and BCπ0 assumed that contributed (co)variances of all SNP are equal. Model BSSVS sampled SNP effects from a distribution with large (or small) effects to model SNP that are (or not) associated with a quantitative trait locus. For reasons of comparison, model A including pedigree but not SNP information was fitted as well. Results: In terms of accuracies for animals without phenotypes, the models generally ranked as follows: BSSVS > BCπ0 > G >> A. Using multi-trait SNP-based models, the accuracy for juvenile animals without any phenotypes increased up to 0.10. For animals with phenotypes on an indicator trait only, accuracy increased up to 0.03 and 0.14, for genetic correlations with the evaluated trait of 0.25 and 0.75, respectively. Conclusions: When the indicator trait had a genetic correlation lower than 0.5 with the trait of interest in our simulated data, the accuracy was higher if genotypes rather than phenotypes were obtained for the indicator trait. However, when genetic correlations were higher than 0.5, using an indicator trait led to higher accuracies for selection candidates. 
For different combinations of traits, the level of genetic correlation below which genotyping selection candidates is more effective than obtaining phenotypes for an indicator trait, needs to be derived considering at least the heritabilities and the numbers of animals recorded for the traits involved. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
145. Estimating genomic breeding values and detecting QTL using univariate and bivariate models.
- Author
-
Calus, Mario P. L., Mulder, Han A., and Veerkamp, Roel F.
- Subjects
*ANIMAL breeding ,*GENERALIZED estimating equations ,*PHENOTYPES ,*MULTITRAIT multimethod techniques ,*DEVELOPMENTAL stability (Genetics) - Abstract
Background: Genomic selection is particularly beneficial for difficult or expensive to measure traits. Since multi-trait selection is an important tool to deal with such cases, an important question is what the added value is of multi-trait genomic selection. Methods: The simulated dataset, including a quantitative and binary trait, was analyzed with four univariate and bivariate linear models to predict breeding values for juvenile animals. Two models estimated variance components with REML using a numerator (A), or SNP based relationship matrix (G). Two SNP based Bayesian models included one (BayesA) or two distributions (BayesC) for estimated SNP effects. The bivariate BayesC model sampled QTL probabilities for each SNP conditional on both traits. Genotypes were permuted 2,000 times against phenotypes and pedigree, to obtain significance thresholds for posterior QTL probabilities. Genotypes were permuted rather than phenotypes, to retain relationships between pedigree and phenotypes, such that polygenic effects could still be estimated. Results: Correlations between estimated breeding values (EBV) of different SNP based models, for juvenile animals, were greater than 0.93 (0.87) for the quantitative (binary) trait. Estimated genetic correlation was 0.71 (0.66) for model G (A). Accuracies of breeding values of SNP based models were for both traits highest for BayesC and lowest for G. Accuracies of breeding values of bivariate models were up to 0.08 higher than for univariate models. The bivariate BayesC model detected 14 out of 32 QTL for the quantitative trait, and 8 out of 22 for the binary trait. Conclusions: Accuracy of EBV clearly improved for both traits using bivariate compared to univariate models. BayesC achieved highest accuracies of EBV and was also one of the methods that found most QTL. Permuting genotypes against phenotypes and pedigree in BayesC provided an effective way to derive significance thresholds for posterior QTL probabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
146. Estimating genetic diversity across the neutral genome with the use of dense marker maps.
- Author
-
Engelsma, Krista A., Calus, Mario P. L., Bijma, Piter, and Windig, Jack J.
- Subjects
GENOMICS ,GENETIC markers ,GENE frequency ,HETEROZYGOSITY ,HERITABILITY - Abstract
Background: With the advent of high throughput DNA typing, dense marker maps have become available to investigate genetic diversity on specific regions of the genome. The aim of this paper was to compare two marker based estimates of the genetic diversity in specific genomic regions lying in between markers: IBD-based genetic diversity and heterozygosity. Methods: A computer simulated population was set up with individuals containing a single 1-Morgan chromosome and 1665 SNP markers and from this one, an additional population was produced with a lower marker density i.e. 166 SNP markers. For each marker interval based on adjacent markers, the genetic diversity was estimated either by IBD probabilities or heterozygosity. Estimates were compared to each other and to the true genetic diversity. The latter was calculated for a marker in the middle of each marker interval that was not used to estimate genetic diversity. Results: The simulated population had an average minor allele frequency of 0.28 and an LD (r²) of 0.26, comparable to those of real livestock populations. Genetic diversities estimated by IBD probabilities and by heterozygosity were positively correlated, and correlations with the true genetic diversity were quite similar for the simulated population with a high marker density, both for specific regions (r = 0.19-0.20) and large regions (r = 0.61-0.64) over the genome. For the population with a lower marker density, the correlation with the true genetic diversity turned out to be higher for the IBD-based genetic diversity. Conclusions: Genetic diversities of ungenotyped regions of the genome (i.e. between markers) estimated by IBD-based methods and heterozygosity give similar results for the simulated population with a high marker density. However, for a population with a lower marker density, the IBD-based method gives a better prediction, since variation and recombination between markers are missed with heterozygosity. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
147. Prediction of haplotypes for ungenotyped animals and its effect on marker-assisted breeding value estimation.
- Author
-
Mulder, Han A., Calus, Mario P. L., and Veerkamp, Roel F.
- Subjects
HAPLOTYPES ,LIVESTOCK breeding ,HERITABILITY ,GENETIC polymorphisms ,BIOMARKERS - Abstract
Background: In livestock populations, missing genotypes on a large proportion of animals are a major problem to implement the estimation of marker-assisted breeding values using haplotypes. The objective of this article is to develop a method to predict haplotypes of animals that are not genotyped using mixed model equations and to investigate the effect of using these predicted haplotypes on the accuracy of marker-assisted breeding value estimation. Methods: For genotyped animals, haplotypes were determined and for each animal the number of haplotype copies (nhc) was counted, i.e. 0, 1 or 2 copies. In a mixed model framework, nhc for each haplotype were predicted for ungenotyped animals as well as for genotyped animals using the additive genetic relationship matrix. The heritability of nhc was assumed to be 0.99, allowing for minor genotyping and haplotyping errors. The predicted nhc were subsequently used in marker-assisted breeding value estimation by applying random regression on these covariables. To evaluate the method, a population was simulated with one additive QTL and an additive polygenic genetic effect. The QTL was located in the middle of a haplotype based on SNP-markers. Results: The accuracy of predicted haplotype copies for ungenotyped animals ranged between 0.59 and 0.64 depending on haplotype length. Because powerful BLUP-software was used, the method was computationally very efficient. The accuracy of total EBV increased for genotyped animals when marker-assisted breeding value estimation was compared with conventional breeding value estimation, but for ungenotyped animals the increase was marginal unless the heritability was smaller than 0.1. Haplotypes based on four markers yielded the highest accuracies and when only the nearest left marker was used, it yielded the lowest accuracy. The accuracy increased with increasing marker density. Accuracy of the total EBV approached that of gene-assisted BLUP when 4-marker haplotypes were used with a distance of 0.1 cM between the markers. Conclusions: The proposed method is computationally very efficient and suitable for marker-assisted breeding value estimation in large livestock populations including effects of a number of known QTL. Marker-assisted breeding value estimation using predicted haplotypes increases accuracy especially for traits with low heritability. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
148. Sensitivity of methods for estimating breeding values using genetic markers to the number of QTL and distribution of QTL variance.
- Author
-
Coster, Albart, Bastiaansen, John W. M., Calus, Mario P. L., van Arendonk, Johan A. M., and Bovenhuis, Henk
- Subjects
GENETIC markers ,HERITABILITY ,BAYESIAN analysis ,BREEDING ,GENETIC regulation - Abstract
The objective of this simulation study was to compare the effect of the number of QTL and distribution of QTL variance on the accuracy of breeding values estimated with genomewide markers (MEBV). Three distinct methods were used to calculate MEBV: a Bayesian Method (BM), Least Angle Regression (LARS) and Partial Least Square Regression (PLSR). The accuracy of MEBV calculated with BM and LARS decreased when the number of simulated QTL increased. The accuracy decreased more when QTL had different variance values than when all QTL had an equal variance. The accuracy of MEBV calculated with PLSR was affected neither by the number of QTL nor by the distribution of QTL variance. Additional simulations and analyses showed that these conclusions were not affected by the number of individuals in the training population, by the number of markers and by the heritability of the trait. Results of this study show that the effect of the number of QTL and distribution of QTL variance on the accuracy of MEBV depends on the method that is used to calculate MEBV. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
149. Estimation of prediction error variances via Monte Carlo sampling methods using different formulations of the prediction error variance.
- Author
-
Hickey, John M., Veerkamp, Roel F., Calus, Mario P. L., Mulder, Han A., and Thompson, Robin
- Subjects
ESTIMATION theory ,MONTE Carlo method ,COVARIANCE matrices ,ANIMAL breeding ,ALGORITHMS ,CONVERGENT evolution - Abstract
Calculation of the exact prediction error variance covariance matrix is often computationally too demanding, which limits its application in REML algorithms, the calculation of accuracies of estimated breeding values and the control of variance of response to selection. Alternatively Monte Carlo sampling can be used to calculate approximations of the prediction error variance, which converge to the true values if enough samples are used. However, in practical situations the number of samples, which are computationally feasible, is limited. The objective of this study was to compare the convergence rate of different formulations of the prediction error variance calculated using Monte Carlo sampling. Four of these formulations were published, four were corresponding alternative versions, and two were derived as part of this study. The different formulations had different convergence rates and these were shown to depend on the number of samples and on the level of prediction error variance. Four formulations were competitive and these made use of information on either the variance of the estimated breeding value and on the variance of the true breeding value minus the estimated breeding value or on the covariance between the true and estimated breeding values. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
150. Estimating genomic breeding values from the QTL-MAS Workshop Data using a single SNP and haplotype/IBD approach.
- Author
-
Calus, Mario P. L., de Roos, Sander P. W., and Veerkamp, Roel F.
- Subjects
*BREEDING ,*GENOMES ,*BAYESIAN analysis ,*MATRICES (Mathematics) ,*REGRESSION analysis ,*MONOGENIC & polygenic inheritance (Genetics) ,*PROBABILITY theory ,*PHENOTYPES ,*GENETIC research - Abstract
Genomic breeding values were estimated using a Gibbs sampler that avoided the use of the Metropolis-Hastings step as implemented in the BayesB model of Meuwissen et al., Genetics 2001, 157:1819-1829. Two models that estimated genomic estimated breeding values (EBVs) were applied: one used constructed haplotypes (based on alleles of 20 markers) and IBD matrices, another used single SNP regression. Both models were applied with or without polygenic effect. A fifth model included only polygenic effects and no genomic information. The models needed to estimate 366,959 effects for the haplotype/IBD approach, but only 11,850 effects for the single SNP approach. The four genomic models identified 11 to 14 regions that had a posterior QTL probability >0.1. Accuracies of genomic selection breeding values for animals in generations 4-6 ranged from 0.84 to 0.87 (haplotype/IBD vs. SNP). It can be concluded that including a polygenic effect in the genomic model had no effect on the accuracy of the total EBVs or prediction of the QTL positions. The SNP model yielded slightly higher accuracies for the total EBVs, while both models were able to detect nearly all QTL that explained at least 0.5% of the total phenotypic variance. [ABSTRACT FROM AUTHOR]
- Published
- 2009