37 results on '"Goddard, Michael"'
Search Results
2. In it for the long run: perspectives on exploiting long-read sequencing in livestock for population scale studies of structural variants
- Author
-
Nguyen, Tuan V., Vander Jagt, Christy J., Wang, Jianghui, Daetwyler, Hans D., Xiang, Ruidong, Goddard, Michael E., Nguyen, Loan T., Ross, Elizabeth M., Hayes, Ben J., Chamberlain, Amanda J., and MacLeod, Iona M.
- Published
- 2023
3. Allele-specific binding variants causing ChIP-seq peak height of histone modification are not enriched in expression QTL annotations
- Author
-
Ghoreishifar, Mohammad, Chamberlain, Amanda J., Xiang, Ruidong, Prowse-Wilkins, Claire P., Lopdell, Thomas J., Littlejohn, Mathew D., Pryce, Jennie E., and Goddard, Michael E.
- Published
- 2024
4. A common regulatory haplotype doubles lactoferrin concentration in milk
- Author
-
Lopdell, Thomas J., Trevarton, Alexander J., Moody, Janelle, Prowse-Wilkins, Claire, Knowles, Sarah, Tiplady, Kathryn, Chamberlain, Amanda J., Goddard, Michael E., Spelman, Richard J., Lehnert, Klaus, Snell, Russell G., Davis, Stephen R., and Littlejohn, Mathew D.
- Published
- 2024
5. Sharing of either phenotypes or genetic variants can increase the accuracy of genomic prediction of feed efficiency
- Author
-
Bolormaa, Sunduimijid, MacLeod, Iona M., Khansefid, Majid, Marett, Leah C., Wales, William J., Miglior, Filippo, Baes, Christine F., Schenkel, Flavio S., Connor, Erin E., Manzanilla-Pech, Coralia I. V., Stothard, Paul, Herman, Emily, Nieuwhof, Gert J., Goddard, Michael E., and Pryce, Jennie E.
- Published
- 2022
6. Genetic variation in histone modifications and gene expression identifies regulatory variants in the mammary gland of cattle
- Author
-
Prowse-Wilkins, Claire P., Lopdell, Thomas J., Xiang, Ruidong, Vander Jagt, Christy J., Littlejohn, Mathew D., Chamberlain, Amanda J., and Goddard, Michael E.
- Published
- 2022
7. Comparing allele specific expression and local expression quantitative trait loci and the influence of gene expression on complex trait variation in cattle
- Author
-
Khansefid, Majid, Pryce, Jennie E., Bolormaa, Sunduimijid, Chen, Yizhou, Millen, Catriona A., Chamberlain, Amanda J., Vander Jagt, Christy J., and Goddard, Michael E.
- Published
- 2018
8. Genome variants associated with RNA splicing variations in bovine are extensively shared between tissues
- Author
-
Xiang, Ruidong, Hayes, Ben J., Vander Jagt, Christy J., MacLeod, Iona M., Khansefid, Majid, Bowman, Phil J., Yuan, Zehu, Prowse-Wilkins, Claire P., Reich, Coralie M., Mason, Brett A., Garner, Josie B., Marett, Leah C., Chen, Yizhou, Bolormaa, Sunduimijid, Daetwyler, Hans D., Chamberlain, Amanda J., and Goddard, Michael E.
- Published
- 2018
9. A multi-trait Bayesian method for mapping QTL and genomic prediction
- Author
-
Kemper, Kathryn E., Bowman, Philip J., Hayes, Benjamin J., Visscher, Peter M., and Goddard, Michael E.
- Published
- 2018
10. Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution.
- Author
-
Pausch, Hubert, Emmerling, Reiner, Gredler-Grandl, Birgit, Fries, Ruedi, Daetwyler, Hans D., and Goddard, Michael E.
- Subjects
NUCLEOTIDES, NUCLEIC acids, DATA analysis, CATTLE breeds, LIVESTOCK breeds
- Abstract
Background: Genotyping and whole-genome sequencing data have been generated for hundreds of thousands of cattle. International consortia used these data to compile imputation reference panels that facilitate the imputation of sequence variant genotypes for animals that have been genotyped using dense microarrays. Association studies with imputed sequence variant genotypes allow for the characterization of quantitative trait loci (QTL) at nucleotide resolution particularly when individuals from several breeds are included in the mapping populations. Results: We imputed genotypes for 28 million sequence variants in 17,229 cattle of the Braunvieh, Fleckvieh and Holstein breeds in order to compile large mapping populations that provide high power to identify QTL for milk production traits. Association tests between imputed sequence variant genotypes and fat and protein percentages in milk uncovered between six and thirteen QTL (P < 1e-8) per breed. Eight of the detected QTL were significant in more than one breed. We combined the results across breeds using meta-analysis and identified a total of 25 QTL including six that were not significant in the within-breed association studies. Two missense mutations in the ABCG2 (p.Y581S, rs43702337, P = 4.3e-34) and GHR (p.F279Y, rs385640152, P = 1.6e-74) genes were the top variants at QTL on chromosomes 6 and 20. Another known causal missense mutation in the DGAT1 gene (p.A232K, rs109326954, P = 8.4e-1436) was the second top variant at a QTL on chromosome 14 but its allelic substitution effects were inconsistent across breeds. It turned out that the conflicting allelic substitution effects resulted from flaws in the imputed genotypes due to the use of a multi-breed reference population for genotype imputation. Conclusions: Many QTL for milk production traits segregate across breeds and across-breed meta-analysis has greater power to detect such QTL than within-breed association testing. 
Association testing between imputed sequence variant genotypes and phenotypes of interest facilitates identifying causal mutations provided the accuracy of imputation is high. However, true causal mutations may remain undetected when the imputed sequence variant genotypes contain flaws. It is highly recommended to validate the effect of known causal variants in order to assess the ability to detect true causal mutations in association studies with imputed sequence variants. [ABSTRACT FROM AUTHOR]
- Published
- 2017
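Note on entry 10: the abstract reports that combining results across breeds found six QTL that were not significant within any single breed. The abstract does not say which meta-analysis method was used, so the sketch below assumes a standard inverse-variance weighted (fixed-effect) combination. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-breed effect estimates by inverse-variance weighting.

    betas: allele substitution effect estimates, one per breed (assumed input)
    ses:   their standard errors
    Returns the pooled estimate, its standard error and the z statistic.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def two_sided_p(z):
    """Two-sided P value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical variant with the same effect in two breeds. Neither breed
    # alone passes the P < 1e-8 threshold quoted in the abstract, but the
    # pooled estimate does.
    betas, ses = [0.1, 0.1], [0.02, 0.02]
    for b, s in zip(betas, ses):
        print("within-breed P:", two_sided_p(b / s))
    beta, se, z = fixed_effect_meta(betas, ses)
    print("pooled beta, se, P:", beta, se, two_sided_p(z))
```

This pooling assumes the effect has the same sign and size in every breed. The DGAT1 case described in the abstract, where imputation flaws gave inconsistent effects across breeds, is exactly the situation in which such a pooled estimate can mislead.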
11. Application of a Bayesian non-linear model hybrid scheme to sequence data for genomic prediction and QTL mapping.
- Author
-
Wang, Tingting, Chen, Yi-Ping Phoebe, MacLeod, Iona M., Pryce, Jennie E., Goddard, Michael E., and Hayes, Ben J.
- Subjects
NUCLEOTIDE sequencing, EXPECTATION-maximization algorithms, MARKOV chain Monte Carlo, CATTLE population genetics, MILK yield, CATTLE fertility
- Abstract
Background: Using whole genome sequence data might improve genomic prediction accuracy, when compared with high-density SNP arrays, and could lead to identification of causal mutations affecting complex traits. For some traits, the most accurate genomic predictions are achieved with non-linear Bayesian methods. However, as the number of variants and the size of the reference population increase, the computational time required to implement these Bayesian methods (typically with Markov Chain Monte Carlo sampling) becomes unfeasibly long. Results: Here, we applied a new method, HyB_BR (for Hybrid BayesR), which implements a mixture model of normal distributions and hybridizes an Expectation-Maximization (EM) algorithm followed by Markov Chain Monte Carlo (MCMC) sampling, to genomic prediction in a large dairy cattle population with imputed whole genome sequence data. The imputed whole genome sequence data included 994,019 variant genotypes of 16,214 Holstein and Jersey bulls and cows. Traits included fat yield, milk volume, protein kg, fat% and protein% in milk, as well as fertility and heat tolerance. HyB_BR achieved genomic prediction accuracies as high as the full MCMC implementation of BayesR, both for predicting a validation set of Holstein and Jersey bulls (multi-breed prediction) and a validation set of Australian Red bulls (across-breed prediction). HyB_BR had a ten-fold reduction in compute time, compared with the MCMC implementation of BayesR (48 hours versus 594 hours). We also demonstrate that in many cases HyB_BR identified sequence variants with a high posterior probability of affecting the milk production or fertility traits that were similar to those identified in BayesR. For heat tolerance, both HyB_BR and BayesR found variants in or close to promising candidate genes associated with this trait and not detected by previous studies. 
Conclusions: The results demonstrate that HyB_BR is a feasible method for simultaneous genomic prediction and QTL mapping with whole genome sequence in large reference populations. [ABSTRACT FROM AUTHOR]
- Published
- 2017
12. Multiple-trait QTL mapping and genomic prediction for wool traits in sheep.
- Author
-
Bolormaa, Sunduimijid, Swan, Andrew A., Brown, Daniel J., Hatcher, Sue, Moghaddar, Nasir, van der Werf, Julius H., Goddard, Michael E., and Daetwyler, Hans D.
- Subjects
SHEEP genetics, GENOMES, SHEEP breeding
- Abstract
Background: The application of genomic selection to sheep breeding could lead to substantial increases in profitability of wool production due to the availability of accurate breeding values from single nucleotide polymorphism (SNP) data. Several key traits determine the value of wool and influence a sheep's susceptibility to fleece rot and fly strike. Our aim was to predict genomic estimated breeding values (GEBV) and to compare three methods of combining information across traits to map polymorphisms that affect these traits. Methods: GEBV for 5726 Merino and Merino crossbred sheep were calculated using BayesR and genomic best linear unbiased prediction (GBLUP) with real and imputed 510,174 SNPs for 22 traits (at yearling and adult ages) including wool production and quality, and breech conformation traits that are associated with susceptibility to fly strike. Accuracies of these GEBV were assessed using fivefold cross-validation. We also devised and compared three approximate multi-trait analyses to map pleiotropic quantitative trait loci (QTL): a multi-trait genome-wide association study and two multi-trait methods that use the output from BayesR analyses. One BayesR method used local GEBV for each trait, while the other used the posterior probabilities that a SNP had an effect on each trait. Results: BayesR and GBLUP resulted in similar average GEBV accuracies across traits (~0.22). BayesR accuracies were highest for wool yield and fibre diameter (>0.40) and lowest for skin quality and dag score (<0.10). Generally, accuracy was higher for traits with larger reference populations and higher heritability. In total, the three multi-trait analyses identified 206 putative QTL, of which 20 were common to the three analyses. The two BayesR multi-trait approaches mapped QTL in a more defined manner than the multi-trait GWAS. We identified genes with known effects on hair growth (i.e. FGF5, STAT3, KRT86, and ALX4) near SNPs with pleiotropic effects on wool traits. 
Conclusions: The mean accuracy of genomic prediction across wool traits was around 0.22. The three multi-trait analyses identified 206 putative QTL across the ovine genome. Detailed phenotypic information helped to identify likely candidate genes. [ABSTRACT FROM AUTHOR]
- Published
- 2017
13. Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle.
- Author
-
Pausch, Hubert, MacLeod, Iona M., Fries, Ruedi, Emmerling, Reiner, Bowman, Phil J., Daetwyler, Hans D., and Goddard, Michael E.
- Subjects
CATTLE locomotion, ANIMAL locomotion, GENOTYPES, NUCLEOTIDE sequencing, HEMOGLOBIN polymorphisms
- Abstract
Background: The availability of dense genotypes and whole-genome sequence variants from various sources offers the opportunity to compile large datasets consisting of tens of thousands of individuals with genotypes at millions of polymorphic sites that may enhance the power of genomic analyses. The imputation of missing genotypes ensures that all individuals have genotypes for a shared set of variants. Results: We evaluated the accuracy of imputation from dense genotypes to whole-genome sequence variants in 249 Fleckvieh and 450 Holstein cattle using Minimac and FImpute. The sequence variants of a subset of the animals were reduced to the variants that were included on the Illumina BovineHD genotyping array and subsequently inferred in silico using either within- or multi-breed reference populations. The accuracy of imputation varied considerably across chromosomes and dropped at regions where the bovine genome contains segmental duplications. Depending on the imputation strategy, the correlation between imputed and true genotypes ranged from 0.898 to 0.952. The accuracy of imputation was higher with Minimac than FImpute particularly for variants with a low minor allele frequency. Using a multi-breed reference population increased the accuracy of imputation, particularly when FImpute was used to infer genotypes. When the sequence variants were imputed using Minimac, the true genotypes were more correlated to predicted allele dosages than best-guess genotypes. The computing costs to impute 23,256,743 sequence variants in 6958 animals were ten-fold higher with Minimac than FImpute. Association studies with imputed sequence variants revealed seven quantitative trait loci (QTL) for milk fat percentage. Two causal mutations in the DGAT1 and GHR genes were the most significantly associated variants at two QTL on chromosomes 14 and 20 when Minimac was used to infer genotypes. 
Conclusions: The population-based imputation of millions of sequence variants in large cohorts is computationally feasible and provides accurate genotypes. However, the accuracy of imputation is low in regions where the genome contains large segmental duplications or the coverage with array-derived single nucleotide polymorphisms is poor. Using a reference population that includes individuals from many breeds increases the accuracy of imputation particularly at low-frequency variants. Considering allele dosages rather than best-guess genotypes as explanatory variables is advantageous to detect causal mutations in association studies with imputed sequence variants. [ABSTRACT FROM AUTHOR]
- Published
- 2017
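Note on entry 13: the abstract contrasts predicted allele dosages with best-guess genotypes as explanatory variables. The sketch below illustrates the difference for a single imputed genotype. It assumes the imputation software reports three posterior genotype probabilities per animal and variant. The function names and the example probabilities are hypothetical and do not reflect the output format of Minimac or FImpute.

```python
def allele_dosage(probs):
    """Expected count of the alternate allele (a value between 0 and 2).

    probs: (P(0 copies), P(1 copy), P(2 copies)), assumed to sum to 1.
    """
    _p0, p1, p2 = probs
    return p1 + 2.0 * p2


def best_guess(probs):
    """Most probable genotype, coded as 0, 1 or 2 alternate alleles."""
    return max(range(3), key=lambda g: probs[g])


if __name__ == "__main__":
    # An uncertain imputed genotype: the best guess collapses it to a single
    # value, while the dosage keeps the uncertainty as a fractional count.
    probs = (0.1, 0.5, 0.4)
    print("best guess:", best_guess(probs))
    print("dosage:", allele_dosage(probs))
```

Because the dosage retains imputation uncertainty, it is the continuous covariate one would regress the phenotype on; the best-guess genotype discards that information.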
14. A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.
- Author
-
Wang, Tingting, Chen, Yi-Ping Phoebe, Bowman, Phil J., Goddard, Michael E., and Hayes, Ben J.
- Subjects
GENOMICS, GENE mapping, MARKOV chain Monte Carlo, PREDICTION models, EXPECTATION-maximization algorithms
- Abstract
Background: Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Markov Chain Monte Carlo (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC iterations without the requirement for burn-in. Results: To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breed that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC; however, computational time was reduced to as little as 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Wellcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. 
Conclusions: The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform as well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, but up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2016
15. Copy number variants in the sheep genome detected using multiple approaches.
- Author
-
Jenkins, Gemma M., Goddard, Michael E., Black, Michael A., Brauning, Rudiger, Auvray, Benoit, Dodds, Ken G., Kijas, James W., Cockett, Noelle, and McEwan, John C.
- Subjects
DNA copy number variations, SHEEP genetics, POLYMORPHISM (Zoology), NUCLEOTIDE sequencing, GENOMICS
- Abstract
Background: Copy number variants (CNVs) are a type of polymorphism found to underlie phenotypic variation, both in humans and livestock. Most surveys of CNV in livestock have been conducted in the cattle genome, and often utilise only a single approach for the detection of copy number differences. Here we performed a study of CNV in sheep, using multiple methods to identify and characterise copy number changes. Comprehensive information from small pedigrees (trios) was collected using multiple platforms (array CGH, SNP chip and whole genome sequence data), with these data then analysed via multiple approaches to identify and verify CNVs. Results: In total, 3,488 autosomal CNV regions (CNVRs) were identified in this study, which substantially builds on an initial survey of the sheep genome that identified 135 CNVRs. The average length of the identified CNVRs was 19 kb (range of 1 kb to 3.6 Mb), with shorter CNVRs being more frequent than longer CNVRs. The total length of all CNVRs was 67.6 Mbp, which equates to 2.7 % of the sheep autosomes. For individuals this value ranged from 0.24 to 0.55 %, and the majority of CNVRs were identified in single animals. Rather than being uniformly distributed throughout the genome, CNVRs tended to be clustered. Application of three independent approaches for CNVR detection facilitated a comparison of validation rates. CNVs identified on the Roche-NimbleGen 2.1M CGH array generally had low validation rates with lower density arrays, while whole genome sequence data had the highest validation rate (>60 %). Conclusions: This study represents the first comprehensive survey of the distribution, prevalence and characteristics of CNVR in sheep. Multiple approaches were used to detect CNV regions and it appears that the best method for verifying CNVR on a large scale involves using a combination of detection methodologies. 
The characteristics of the 3,488 autosomal CNV regions identified in this study are comparable to other CNV regions reported in the literature and provide a valuable and sizeable addition to the small subset of published sheep CNVs. [ABSTRACT FROM AUTHOR]
- Published
- 2016
16. Detailed phenotyping identifies genes with pleiotropic effects on body composition.
- Author
-
Bolormaa, Sunduimijid, Hayes, Ben J., van der Werf, Julius H. J., Pethick, David, Goddard, Michael E., and Daetwyler, Hans D.
- Subjects
BODY composition, HUMAN genetic variation, GENETIC pleiotropy, GLYCOGEN synthases, PHENOTYPES
- Abstract
Background: Genetic variation in both the composition and distribution of fat and muscle in the body is important to human health as well as the healthiness and value of meat from cattle and sheep. Here we use detailed phenotyping and a multi-trait approach to identify genes explaining variation in body composition traits. Results: A multi-trait genome wide association analysis of 56 carcass composition traits measured on 10,613 sheep with imputed and real genotypes on 510,174 SNPs was performed. We clustered 71 significant SNPs into five groups based on their pleiotropic effects across the 56 traits. Among these 71 significant SNPs, one group of 11 SNPs affected the fatty acid profile of the muscle and were close to 8 genes involved in fatty acid or triglyceride synthesis. Another group of 23 SNPs had an effect on mature size, based on their pattern of effects across traits, but the genes near this group of SNPs did not share any obvious function. Many of the likely candidate genes near SNPs with significant pleiotropic effects on the 56 traits are involved in intra-cellular signalling pathways. Among the significant SNPs were some with a convincing candidate gene due to the function of the gene (e.g. glycogen synthase affecting glycogen concentration) or because the same gene was associated with similar traits in other species. Conclusions: Using a multi-trait analysis increased the power to detect associations between SNP and body composition traits compared with the single trait analyses. Detailed phenotypic information helped to identify a convincing candidate in some cases as did information from other species. [ABSTRACT FROM AUTHOR]
- Published
- 2016
17. Extensive variation between tissues in allele specific expression in an outbred mammal.
- Author
-
Chamberlain, Amanda J., Vander Jagt, Christy J., Hayes, Benjamin J., Khansefid, Majid, Marett, Leah C., Millen, Catriona A., Nguyen, Thuy T. T., and Goddard, Michael E.
- Subjects
GENE expression in mammals, ALLELES, CHROMOSOMES, RNA sequencing, NUCLEOTIDE sequencing, GENES
- Abstract
Background: Allele specific gene expression (ASE), with the paternal allele more expressed than the maternal allele or vice versa, appears to be a common phenomenon in humans and mice. In other species the extent of ASE is unknown, and even in humans and mice there are several outstanding questions. These include: to what extent is ASE tissue specific; how often does the direction of allele expression imbalance reverse between tissues; how often is only one of the two alleles expressed; is there a genome-wide bias towards expression of the paternal or maternal allele; and finally, do genes that are nearby on a chromosome share the same direction of ASE? Here we use gene expression data (RNASeq) from 18 tissues from a single cow to investigate each of these questions in turn, and then validate some of these findings in two tissues from 20 cows. Results: Between 40 and 100 million sequence reads were generated per tissue across three replicate samples for each of the eighteen tissues from the single cow (the discovery dataset). A bovine gene expression atlas was created (the first from RNASeq data), and differentially expressed genes in each tissue were identified. To analyse ASE, we had access to unambiguously phased genotypes for all heterozygous variants in the cow's whole genome sequence, where these variants were homozygous in the whole genome sequence of her sire, and as a result we were able to map reads to parental genomes, to determine SNPs and genes showing ASE in each tissue. In total 25,251 heterozygous SNP within 7985 genes were tested for ASE in at least one tissue. ASE was pervasive: 89 % of genes tested had significant ASE in at least one tissue. This large proportion of genes displaying ASE was confirmed in the two tissues in a validation dataset. For individual tissues the proportion of genes showing significant ASE varied from as low as 8-16 % of those tested in thymus to as high as 71-82 % of those tested in lung. 
There were a number of cases where the direction of allele expression imbalance reversed between tissues. For example the gene SPTY2D1 showed almost complete paternal allele expression in kidney and thymus, and almost complete maternal allele expression in the brain caudal lobe and brain cerebellum. Mono allelic expression (MAE) was common, with 1349 of 4856 genes (28 %) tested with more than one heterozygous SNP showing MAE. Across all tissues, 54.17 % of all genes with ASE favoured the paternal allele. Genes that are closely linked on the chromosome were more likely to show higher expression of the same allele (paternal or maternal) than expected by chance. We identified several long runs of neighbouring genes that showed either paternal or maternal ASE; one example was five adjacent genes (GIMAP8, GIMAP7 copy 1, GIMAP4, GIMAP7 copy 2 and GIMAP5) that showed almost exclusive paternal expression in brain caudal lobe. Conclusions: Investigating the extent of ASE across 18 bovine tissues in one cow and two tissues in 20 cows demonstrated 1) ASE is pervasive in cattle, 2) the ASE is often MAE but ranges from MAE to slight overexpression of the major allele, 3) the ASE is most often tissue specific and that more than half the time displays divergent allele specific expression patterns across tissues, 4) across all genes there is a slight bias towards expression of the paternal allele and 5) genes expressing the same parental allele are clustered together more than expected by chance, and there are several runs of large numbers of genes expressing the same parental allele. [ABSTRACT FROM AUTHOR]
- Published
- 2015
18. Impact of QTL properties on the accuracy of multi-breed genomic prediction.
- Author
-
Wientjes, Yvonne C. J., Calus, Mario P. L., Goddard, Michael E., and Hayes, Ben J.
- Subjects
GENOMES, GENETIC algorithms, FORECASTING, GENETICS, ANIMAL genetics
- Abstract
Background: Although simulation studies show that combining multiple breeds in one reference population increases accuracy of genomic prediction, this is not always confirmed in empirical studies. This discrepancy might be due to the assumptions on quantitative trait loci (QTL) properties applied in simulation studies, including number of QTL, spectrum of QTL allele frequencies across breeds, and distribution of allele substitution effects. We investigated the effects of QTL properties and of including a random across- and within-breed animal effect in a genomic best linear unbiased prediction (GBLUP) model on accuracy of multi-breed genomic prediction using genotypes of Holstein-Friesian and Jersey cows. Methods: Genotypes of three classes of variants obtained from whole-genome sequence data, with moderately low, very low or extremely low average minor allele frequencies (MAF), were imputed in 3000 Holstein-Friesian and 3000 Jersey cows that had real high-density genotypes. Phenotypes of traits controlled by QTL with different properties were simulated by sampling 100 or 1000 QTL from one class of variants and their allele substitution effects either randomly from a gamma distribution, or computed such that each QTL explained the same variance, i.e. rare alleles had a large effect. Genomic breeding values for 1000 selection candidates per breed were estimated using GBLUP models including a random across- and a within-breed animal effect. Results: For all three classes of QTL allele frequency spectra, accuracies of genomic prediction were not affected by the addition of 2000 individuals of the other breed to a reference population of the same breed as the selection candidates. Accuracies of both single- and multi-breed genomic prediction decreased as MAF of QTL decreased, especially when rare alleles had a large effect. 
Accuracies of genomic prediction were similar for the models with and without a random within-breed animal effect, probably because of insufficient power to separate across- and within-breed animal effects. Conclusions: Accuracy of both single- and multi-breed genomic prediction depends on the properties of the QTL that underlie the trait. As QTL MAF decreased, accuracy decreased, especially when rare alleles had a large effect. This demonstrates that QTL properties are key parameters that determine the accuracy of genomic prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2015
19. A computationally efficient algorithm for genomic prediction using a Bayesian model.
- Author
-
Wang, Tingting, Chen, Yi-Ping Phoebe, Goddard, Michael E., Meuwissen, Theo H. E., Kemper, Kathryn E., and Hayes, Ben J.
- Subjects
GENETICS, GENOTYPES, MARKOV processes, DAIRY cattle, BLOOD proteins, INBREEDING, CATTLE
- Abstract
Background: Genomic prediction of breeding values from dense single nucleotide polymorphisms (SNP) genotypes is used for livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects from a heavy tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes to sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation-maximisation algorithm (emBayesR) that gives similar estimates of SNP effects and accuracies of genomic prediction to those of the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time. Methods: emBayesR is an approximate EM algorithm that retains the BayesR model assumption with SNP effects sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data, and real dairy cattle data with 632 003 SNPs genotyped, to determine if the MCMC and the expectation-maximisation approaches give similar accuracies of genomic prediction. Results: We were able to demonstrate that allowing for the error associated with estimation of other SNP effects when estimating the effect of each SNP in emBayesR improved the accuracy of genomic prediction over emBayesR without including this error correction, with both simulated and real data. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. 
However, emBayesR reduced computing time up to 8-fold compared to BayesR. Conclusions: The emBayesR algorithm described here achieved similar accuracies of genomic prediction to BayesR for a range of simulated and real 630 K dairy SNP data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2015
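Note on entry 19: the abstract describes the BayesR assumption that SNP effects come from a mixture of normal distributions with increasing variance. The sketch below shows only that mixture idea for one SNP: given an effect estimate and its error variance, it computes the posterior probability of each mixture class. It is not the emBayesR algorithm. The class variances (0, 1e-4, 1e-3 and 1e-2 times the genetic variance, here set to 1) are the values commonly quoted for BayesR and are an assumption here, as are the mixing proportions, the function names and the example numbers.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance var, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def class_posteriors(b_hat, err_var, class_vars, mix_props):
    """Posterior probability that a SNP effect belongs to each mixture class.

    b_hat:      estimated SNP effect (assumed unbiased)
    err_var:    sampling variance of b_hat, must be positive
    class_vars: effect variance of each mixture class (the first may be 0)
    mix_props:  prior proportion of SNPs in each class
    Marginally, b_hat ~ N(0, class variance + err_var) within a class.
    """
    if err_var <= 0:
        raise ValueError("err_var must be positive")
    likes = [pi * normal_pdf(b_hat, s2 + err_var)
             for pi, s2 in zip(mix_props, class_vars)]
    total = sum(likes)
    return [like / total for like in likes]


if __name__ == "__main__":
    class_vars = [0.0, 1e-4, 1e-3, 1e-2]      # assumed, genetic variance = 1
    mix_props = [0.94, 0.04, 0.015, 0.005]    # hypothetical starting values
    for b_hat in (0.0, 0.1):
        print(b_hat, class_posteriors(b_hat, 1e-4, class_vars, mix_props))
```

With these hypothetical inputs, a near-zero estimate is assigned mostly to the zero-variance class, while a large estimate moves almost all posterior mass to the largest-variance class. That shift is what lets a mixture model shrink most SNP effects hard while leaving a few large ones nearly untouched.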
20. Non-additive genetic variation in growth, carcass and fertility traits of beef cattle.
- Author
-
Bolormaa, Sunduimijid, Pryce, Jennie E., Zhang, Yuandan, Reverter, Antonio, Barendse, William, Hayes, Ben J., and Goddard, Michael E.
- Subjects
CATTLE genetics, BEEF cattle, GENE expression, SINGLE nucleotide polymorphisms, EPISTASIS (Genetics), CATTLE
- Abstract
Background: A better understanding of non-additive variance could lead to increased knowledge on the genetic control and physiology of quantitative traits, and to improved prediction of the genetic value and phenotype of individuals. Genome-wide panels of single nucleotide polymorphisms (SNPs) have been mainly used to map additive effects for quantitative traits, but they can also be used to investigate non-additive effects. We estimated dominance and epistatic effects of SNPs on various traits in beef cattle and the variance explained by dominance, and quantified the increase in accuracy of phenotype prediction by including dominance deviations in its estimation. Methods: Genotype data (729 068 real or imputed SNPs) and phenotypes on up to 16 traits of 10 191 individuals from Bos taurus, Bos indicus and composite breeds were used. A genome-wide association study was performed by fitting the additive and dominance effects of single SNPs. The dominance variance was estimated by fitting a dominance relationship matrix constructed from the 729 068 SNPs. The accuracy of predicted phenotypic values was evaluated by best linear unbiased prediction using the additive and dominance relationship matrices. Epistatic interactions (additive × additive) were tested between each of the 28 SNPs that are known to have additive effects on multiple traits, and each of the other remaining 729 067 SNPs. Results: The number of significant dominance effects was greater than expected by chance and most of them were in the direction that is presumed to increase fitness and in the opposite direction to inbreeding depression. Estimates of dominance variance explained by SNPs varied widely between traits, but had large standard errors. The median dominance variance across the 16 traits was equal to 5% of the phenotypic variance. Including a dominance deviation in the prediction did not significantly increase its accuracy for any of the phenotypes. 
The number of additive × additive epistatic effects that were statistically significant was greater than expected by chance. Conclusions: Significant dominance and epistatic effects occur for growth, carcass and fertility traits in beef cattle but they are difficult to estimate precisely and including them in phenotype prediction does not increase its accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2015
21. Improved precision of QTL mapping using a nonlinear Bayesian method in a multi-breed population leads to greater accuracy of across-breed genomic predictions.
- Author
-
Kemper, Kathryn E., Reich, Coralie M., Bowman, Philip J., vander Jagt, Christy J., Chamberlain, Amanda J., Mason, Brett A., Hayes, Benjamin J., and Goddard, Michael E.
- Subjects
DAIRY cattle genetics ,BAYESIAN analysis ,GENOMICS ,SINGLE nucleotide polymorphisms ,DAIRY cattle breeding ,ANIMAL population density ,GENE expression ,CATTLE - Abstract
Background: Genomic selection is increasingly widely practised, particularly in dairy cattle. However, the accuracy of current predictions using GBLUP (genomic best linear unbiased prediction) decays rapidly across generations, and also as selection candidates become less related to the reference population. This is likely caused by the effects of causative mutations being dispersed across many SNPs (single nucleotide polymorphisms) that span large genomic intervals. In this paper, we hypothesise that the use of a nonlinear method (BayesR), combined with a multi-breed (Holstein/Jersey) reference population will map causative mutations with more precision than GBLUP and this, in turn, will increase the accuracy of genomic predictions for selection candidates that are less related to the reference animals. Results: BayesR improved the across-breed prediction accuracy for Australian Red dairy cattle for five milk yield and composition traits by an average of 7% over the GBLUP approach (Australian Red animals were not included in the reference population). Using the multi-breed reference population with BayesR improved accuracy of prediction in Australian Red cattle by 2 - 5% compared to using BayesR with a single breed reference population. Inclusion of 8478 Holstein and 3917 Jersey cows in the reference population improved accuracy of predictions for these breeds by 4 and 5%. However, predictions for Holstein and Jersey cattle were similar using within-breed and multi-breed reference populations. We propose that the improvement in across-breed prediction achieved by BayesR with the multi-breed reference population is due to more precise mapping of quantitative trait loci (QTL), which was demonstrated for several regions. New candidate genes with functional links to milk synthesis were identified using differential gene expression in the mammary gland. 
Conclusions: QTL detection and genomic prediction are usually considered independently but persistence of genomic prediction accuracies across breeds requires accurate estimation of QTL effects. We show that accuracy of across-breed genomic predictions was higher with BayesR than with GBLUP and that BayesR mapped QTL more precisely. Further improvements of across-breed accuracy of genomic predictions and QTL mapping could be achieved by increasing the size of the reference population, including more breeds, and possibly by exploiting pleiotropic effects to improve mapping efficiency for QTL with small effects. [ABSTRACT FROM AUTHOR]
- Published
- 2015
22. Identification of genomic regions associated with inbreeding depression in Holstein and Jersey dairy cattle.
- Author
-
Pryce, Jennie E., Haile-Mariam, Mekonnen, Goddard, Michael E., and Hayes, Ben J.
- Subjects
DAIRY cattle breeding research ,HOLSTEIN-Friesian cattle ,JERSEY cattle ,INBREEDING ,SINGLE nucleotide polymorphisms ,ALLELES ,HOMOZYGOSITY - Abstract
Background: Inbreeding reduces the fitness of individuals by increasing the frequency of homozygous deleterious recessive alleles. Some insight into the genetic architecture of fitness, and other complex traits, can be gained by using single nucleotide polymorphism (SNP) data to identify regions of the genome which lead to reduction in performance when identical by descent (IBD). Here, we compared the effect of genome-wide and location-specific homozygosity on fertility and milk production traits in dairy cattle. Methods: Genotype data from more than 43 000 SNPs were available for 8853 Holstein and 4138 Jersey dairy cows that were part of a much larger dataset that had pedigree records (338 696 Holstein and 64 049 Jersey animals). Measures of inbreeding were based on: (1) pedigree data; (2) genotypes to determine the realised proportion of the genome that is IBD; (3) the proportion of the total genome that is homozygous and (4) runs of homozygosity (ROH) which are stretches of the genome that are homozygous. Results: A 1% increase in inbreeding based either on pedigree or genomic data was associated with a decrease in milk, fat and protein yields of around 0.4 to 0.6% of the phenotypic mean, and an increase in calving interval (i.e. a deterioration in fertility) of 0.02 to 0.05% of the phenotypic mean. A genome-wide association study using ROH of more than 50 SNPs revealed genomic regions that resulted in depression of up to 12.5 d and 260 L for calving interval and milk yield, respectively, when completely homozygous. Conclusions: Genomic measures can be used instead of pedigree-based inbreeding to estimate inbreeding depression. Both the diagonal elements of the genomic relationship matrix and the proportion of homozygous SNPs can be used to measure inbreeding. Longer ROH (>3 Mb) were found to be associated with a reduction in milk yield and captured recent inbreeding independently and in addition to overall homozygosity. 
Inbreeding depression can be reduced by minimizing overall inbreeding but maybe also by avoiding the production of offspring that are homozygous for deleterious alleles at specific genomic regions that are associated with inbreeding depression. [ABSTRACT FROM AUTHOR]
- Published
- 2014
23. Genetic variants in mammary development, prolactin signalling and involution pathways explain considerable variation in bovine milk production and milk composition.
- Author
-
Raven, Lesley-Ann, Cocks, Benjamin G., Goddard, Michael E., Pryce, Jennie E., and Hayes, Ben J.
- Subjects
MAMMALS ,PROLACTIN ,GENETIC polymorphisms ,SINGLE nucleotide polymorphisms ,MILK yield ,BOVINE anatomy - Abstract
Background: The maintenance of lactation in mammals is the result of a balance between competing signals from mammary development, prolactin signalling and involution pathways. Dairy cattle are an interesting case study to investigate the effect of polymorphisms that affect the function of genes in these pathways. In dairy cattle, lactation yields and milk composition (for example protein percentage and fat percentage) are routinely recorded, and these vary greatly between individuals. In this study, we test 8058 single nucleotide polymorphisms in or close to genes in these pathways for association with milk production traits and determine the proportion of variance explained by each pathway, using data on 16 812 dairy cattle, including Holstein-Friesian and Jersey bulls and cows. Results: Single nucleotide polymorphisms close to genes in the mammary development, prolactin signalling and involution pathways were significantly associated with milk production traits. The involution pathway explained the largest proportion of genetic variation for production traits. The mammary development pathway also explained additional genetic variation for milk volume, fat percentage and protein percentage. Conclusions: Genetic variants in the involution pathway explained considerably more genetic variation in milk production traits than expected by chance. Many of the associations for single nucleotide polymorphisms in genes in this pathway have not been detected in conventional genome-wide association studies. The pathway approach used here allowed us to identify some novel candidates for further studies that will be aimed at refining the location of associated genomic regions and identifying polymorphisms contributing to variation in lactation volume and milk composition. [ABSTRACT FROM AUTHOR]
- Published
- 2014
24. Selection for complex traits leaves little or no classic signatures of selection.
- Author
-
Kemper, Kathryn E., Saxton, Sarah J., Bolormaa, Sunduimijid, Hayes, Benjamin J., and Goddard, Michael E.
- Abstract
Background: Selection signatures aim to identify genomic regions underlying recent adaptations in populations. However, the effects of selection in the genome are difficult to distinguish from random processes, such as genetic drift. Often, an association between selection signatures and selected variants for complex traits is assumed even though this is rarely (if ever) tested. In this paper, we use 8 breeds of domestic cattle under strong artificial selection to investigate if selection signatures are co-located in genomic regions which are likely to be under selection. Results: Our approaches to identify selection signatures (haplotype heterozygosity, integrated haplotype score and FST) identified strong and recent selection near many loci with mutations affecting simple traits under strong selection, such as coat colour. However, there was little evidence for a genome-wide association between strong selection signatures and regions affecting complex traits under selection, such as milk yield in dairy cattle. Even identifying selection signatures near some major loci was hindered by factors including allelic heterogeneity, selection for ancestral alleles and interactions with nearby selected loci. Conclusions: Selection signatures detect loci with large effects under strong selection. However, the methodology is often assumed to also detect loci affecting complex traits where the selection pressure at an individual locus is weak. We present empirical evidence suggesting little discernible ‘selection signature’ for complex traits in the genome of dairy cattle despite very strong and recent artificial selection. [ABSTRACT FROM AUTHOR]
- Published
- 2014
25. Adaptation of gastrointestinal nematode parasites to host genotype: single locus simulation models.
- Author
-
Kemper, Kathryn E., Goddard, Michael E., and Bishop, Stephen C.
- Subjects
PARASITES ,GASTROINTESTINAL system ,NEMATODES ,WORMS ,COMPUTER simulation - Abstract
Background: Breeding livestock for improved resistance to disease is an increasingly important selection goal. However, the risk of pathogens adapting to livestock bred for improved disease resistance is difficult to quantify. Here, we explore the possibility of gastrointestinal worms adapting to sheep bred for low faecal worm egg count using computer simulation. Our model assumes sheep and worm genotypes interact at a single locus, such that the effect of an A allele in sheep is dependent on worm genotype, and the B allele in worms is favourable for parasitizing the A allele sheep but may increase mortality on pasture. We describe the requirements for adaptation and test if worm adaptation (1) is slowed by non-genetic features of worm infections and (2) can occur with little observable change in faecal worm egg count. Results: Adaptation in worms was found to be primarily influenced by overall worm fitness, viz. the balance between the advantage of the B allele during the parasitic stage in sheep and its disadvantage on pasture. Genetic variation at the interacting locus in worms could be from de novo or segregating mutations, but de novo mutations are rare and segregating mutations are likely constrained to have (near) neutral effects on worm fitness. Most other aspects of the worm infection we modelled did not affect the outcomes. However, the host-controlled mechanism to reduce faecal worm egg count by lowering worm fecundity reduced the selection pressure on worms to adapt compared to other mechanisms, such as increasing worm mortality. Temporal changes in worm egg count were unreliable for detecting adaptation, despite the steady environment assumed in the simulations. Conclusions: Adaptation of worms to sheep selected for low faecal worm egg count requires an allele segregating in worms that is favourable in animals with improved resistance but less favourable in other animals. Obtaining alleles with this specific property seems unlikely. 
With support from experimental data, we conclude that selection for low faecal worm egg count should be stable over a short time frame (e.g. 20 years). We are further exploring model outcomes with multiple loci and comparing outcomes to other control strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2013
26. Genetic architecture of body size in mammals.
- Author
-
Kemper, Kathryn E., Visscher, Peter M., and Goddard, Michael E.
- Published
- 2012
27. Genome position specific priors for genomic prediction.
- Author
-
Brøndum, Rasmus Froberg, Guosheng Su, Lund, Mogens Sandø, Bowman, Philip J., Goddard, Michael E., and Hayes, Benjamin J.
- Subjects
DAIRY cattle ,DAIRY farms ,MILK yield ,GENETIC mutation ,ALLELES - Abstract
Background: The accuracy of genomic prediction is highly dependent on the size of the reference population. For small populations, including information from other populations could improve this accuracy. The usual strategy is to pool data from different populations; however, this has not proven as successful as hoped for with distantly related breeds. BayesRS is a novel approach to share information across populations for genomic predictions. The approach allows information to be captured even where the phase of SNP alleles and causative mutation alleles are reversed across populations, or the actual causative mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data was Illumina 777K SNPs, real or imputed. Results: Results showed an increase in accuracy of up to 3.5% for the Jersey population when using BayesRS with a prior derived from Australian Holstein compared to a model without location specific priors. The increase in accuracy was however lower than was achieved when reference populations were combined to estimate SNP effects, except in the case of fat yield. The small size of the Jersey validation set meant that these improvements in accuracy were not significant using a Hotelling-Williams t-test at the 5% level. An increase in accuracy of 1-2% for all traits was observed in the Australian Holstein population when using a prior derived from the Nordic Holstein population compared to using no prior information. 
These improvements were significant (P<0.05) using the Hotelling Williams t-test for protein- and fat yield. Conclusion: For some traits the method might be advantageous compared to pooling of reference data for distantly related populations, but further investigation is needed to confirm the results. For closely related populations the method does not perform better than pooling reference data. However, it does give an increased accuracy compared to analysis based on only one reference population, without an increased computational burden. The approach described here provides a general setup for inclusion of location specific priors: the approach could be used to include biological information in genomic predictions. [ABSTRACT FROM AUTHOR]
- Published
- 2012
28. The use of communal rearing of families and DNA pooling in aquaculture genomic selection schemes.
- Author
-
Sonesson, Anna K., Meuwissen, Theo H. E., and Goddard, Michael E.
- Subjects
DNA ,GENOTYPE-environment interaction ,ECOLOGICAL genetics ,NATURAL immunity ,GENETIC testing - Abstract
Background: Traditional family-based aquaculture breeding programs, in which families are kept separately until individual tagging and most traits are measured on the sibs of the candidates, are costly and require a high level of reproductive control. The most widely used alternative is a selection scheme, where families are reared communally and the candidates are selected based on their own individual measurements of the traits under selection. However, in the latter selection schemes, inclusion of new traits depends on the availability of noninvasive techniques to measure the traits on selection candidates. This is a severe limitation of these schemes, especially for disease resistance and fillet quality traits. Methods: Here, we present a new selection scheme, which was validated using computer simulations comprising 100 families, among which 1, 10 or 100 were reared communally in groups. Pooling of the DNA from 2000, 20000 or 50000 test individuals with the highest and lowest phenotypes was used to estimate 500, 5000 or 10000 marker effects. One thousand or 2000 out of 20000 candidates were preselected for a growth-like trait. These pre-selected candidates were genotyped, and they were selected on their genome-wide breeding values for a trait that could not be measured on the candidates. Results: A high accuracy of selection, i.e. 0.60-0.88 was obtained with 20000-50000 test individuals but it was reduced when only 2000 test individuals were used. This shows the importance of having large numbers of phenotypic records to accurately estimate marker effects. The accuracy of selection decreased with increasing numbers of families per group. Conclusions: This new selection scheme combines communal rearing of families, pre-selection of candidates, DNA pooling and genomic selection and makes multi-trait selection possible in aquaculture selection schemes without keeping families separately until individual tagging is possible. 
The new scheme can also be used for other farmed species, for which the cost of genotyping test individuals may be high, e.g. if trait heritability is low. [ABSTRACT FROM AUTHOR]
- Published
- 2010
29. Sensitivity of genomic selection to using different prior distributions.
- Author
-
Verbyla, Klara L., Bowman, Philip J., Hayes, Ben J., and Goddard, Michael E.
- Subjects
GENOMICS ,GENETIC polymorphisms ,NUCLEOTIDES ,BAYESIAN analysis ,PHENOTYPES ,GENETIC markers - Abstract
Genomic selection describes a selection strategy based on genomic estimated breeding values (GEBV) predicted from dense genetic markers such as single nucleotide polymorphism (SNP) data. Different Bayesian models have been suggested to derive the prediction equation, with the main difference centred around the specification of the prior distributions. Methods: The simulated dataset of the 13th QTL-MAS workshop was analysed using four Bayesian approaches to predict GEBV for animals without phenotypic information. Different prior distributions were assumed to assess their effect on the accuracy of the predicted GEBV. Conclusion: All methods produced GEBV that were highly correlated with the true breeding values. The models appear relatively insensitive to the choice of prior distributions for the QTL-MAS data set and this is consistent with uniformity of performance of different methods found in real data. [ABSTRACT FROM AUTHOR]
- Published
- 2010
30. Using the realized relationship matrix to disentangle confounding factors for the estimation of genetic variance components of complex traits.
- Author
-
Sang Hong Lee, Goddard, Michael E., Visscher, Peter M., and van der Werf, Julius H. J.
- Subjects
HUMAN genetic variation ,ANALYSIS of covariance ,GENETIC polymorphisms ,HETEROGENEITY ,NUCLEOTIDES ,GENOMICS - Abstract
Background: In the analysis of complex traits, genetic effects can be confounded with non-genetic effects, especially when using full-sib families. Dominance and epistatic effects are typically confounded with additive genetic and nongenetic effects. This confounding may cause the estimated genetic variance components to be inaccurate and biased. Methods: In this study, we constructed genetic covariance structures from whole-genome marker data, and thus used realized relationship matrices to estimate variance components in a heterogenous population of ~ 2200 mice for which four complex traits were investigated. These mice were genotyped for more than 10,000 single nucleotide polymorphisms (SNP) and the variances due to family, cage and genetic effects were estimated by models based on pedigree information only, aggregate SNP information, and model selection for specific SNP effects. Results and conclusions: We show that the use of genome-wide SNP information can disentangle confounding factors to estimate genetic variances by separating genetic and non-genetic effects. The estimated variance components using realized relationship were more accurate and less biased, compared to those based on pedigree information only. Models that allow the selection of individual SNP in addition to fitting a relationship matrix are more efficient for traits with a significant dominance variance. [ABSTRACT FROM AUTHOR]
- Published
- 2010
31. Application of a Bayesian non-linear model hybrid scheme to sequence data for genomic prediction and QTL mapping.
- Author
-
Wang T, Chen YP, MacLeod IM, Pryce JE, Goddard ME, and Hayes BJ
- Subjects
- Algorithms, Animals, Bayes Theorem, Cattle, Female, Fertility genetics, Genotype, Markov Chains, Milk metabolism, Monte Carlo Method, Phenotype, Polymorphism, Single Nucleotide, Chromosome Mapping, Genomics, Nonlinear Dynamics, Quantitative Trait Loci genetics, Whole Genome Sequencing
- Abstract
Background: Using whole genome sequence data might improve genomic prediction accuracy, when compared with high-density SNP arrays, and could lead to identification of causal mutations affecting complex traits. For some traits, the most accurate genomic predictions are achieved with non-linear Bayesian methods. However, as the number of variants and the size of the reference population increase, the computational time required to implement these Bayesian methods (typically with Monte Carlo Markov Chain sampling) becomes unfeasibly long. Results: Here, we applied a new method, HyB_BR (for Hybrid BayesR), which implements a mixture model of normal distributions and hybridizes an Expectation-Maximization (EM) algorithm followed by Markov Chain Monte Carlo (MCMC) sampling, to genomic prediction in a large dairy cattle population with imputed whole genome sequence data. The imputed whole genome sequence data included 994,019 variant genotypes of 16,214 Holstein and Jersey bulls and cows. Traits included fat yield, milk volume, protein kg, fat% and protein% in milk, as well as fertility and heat tolerance. HyB_BR achieved genomic prediction accuracies as high as the full MCMC implementation of BayesR, both for predicting a validation set of Holstein and Jersey bulls (multi-breed prediction) and a validation set of Australian Red bulls (across-breed prediction). HyB_BR had a ten-fold reduction in compute time, compared with the MCMC implementation of BayesR (48 hours versus 594 hours). We also demonstrate that in many cases HyB_BR identified sequence variants with a high posterior probability of affecting the milk production or fertility traits that were similar to those identified in BayesR. 
For heat tolerance, both HyB_BR and BayesR found variants in or close to promising candidate genes associated with this trait and not detected by previous studies. Conclusions: The results demonstrate that HyB_BR is a feasible method for simultaneous genomic prediction and QTL mapping with whole genome sequence in large reference populations.
- Published
- 2017
32. A computationally efficient algorithm for genomic prediction using a Bayesian model.
- Author
-
Wang T, Chen YP, Goddard ME, Meuwissen TH, Kemper KE, and Hayes BJ
- Subjects
- Animals, Bayes Theorem, Cattle, Male, Models, Genetic, Models, Statistical, Polymorphism, Single Nucleotide, Algorithms, Breeding, Genomics methods
- Abstract
Background: Genomic prediction of breeding values from dense single nucleotide polymorphism (SNP) genotypes is used for livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects from a heavy tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes to sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation-maximisation algorithm (emBayesR) that gives similar estimates of SNP effects and accuracies of genomic prediction to the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time. Methods: emBayesR is an approximate EM algorithm that retains the BayesR model assumption with SNP effects sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data, and real dairy cattle data with 632 003 SNPs genotyped, to determine if the MCMC and the expectation-maximisation approaches give similar accuracies of genomic prediction. Results: We were able to demonstrate that allowing for the error associated with estimation of other SNP effects when estimating the effect of each SNP in emBayesR improved the accuracy of genomic prediction over emBayesR without including this error correction, with both simulated and real data. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. 
However, emBayesR reduced computing time up to 8-fold compared to BayesR. Conclusions: The emBayesR algorithm described here achieved similar accuracies of genomic prediction to BayesR for a range of simulated and real 630 K dairy SNP data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets.
- Published
- 2015
33. Genome position specific priors for genomic prediction.
- Author
-
Brøndum RF, Su G, Lund MS, Bowman PJ, Goddard ME, and Hayes BJ
- Subjects
- Alleles, Animals, Breeding, Cattle, Dietary Fats analysis, Female, Gene Expression, Gene Expression Profiling, Genotype, Milk chemistry, Mutation, Oligonucleotide Array Sequence Analysis, Phenotype, Polymorphism, Single Nucleotide, Proteins analysis, Sensitivity and Specificity, Genome, Models, Genetic, Quantitative Trait, Heritable
- Abstract
Background: The accuracy of genomic prediction is highly dependent on the size of the reference population. For small populations, including information from other populations could improve this accuracy. The usual strategy is to pool data from different populations; however, this has not proven as successful as hoped for with distantly related breeds. BayesRS is a novel approach to share information across populations for genomic predictions. The approach allows information to be captured even where the phase of SNP alleles and causative mutation alleles are reversed across populations, or the actual causative mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data was Illumina 777K SNPs, real or imputed. Results: Results showed an increase in accuracy of up to 3.5% for the Jersey population when using BayesRS with a prior derived from Australian Holstein compared to a model without location specific priors. The increase in accuracy was however lower than was achieved when reference populations were combined to estimate SNP effects, except in the case of fat yield. The small size of the Jersey validation set meant that these improvements in accuracy were not significant using a Hotelling-Williams t-test at the 5% level. An increase in accuracy of 1-2% for all traits was observed in the Australian Holstein population when using a prior derived from the Nordic Holstein population compared to using no prior information. 
These improvements were significant (P<0.05) using the Hotelling Williams t-test for protein- and fat yield. Conclusion: For some traits the method might be advantageous compared to pooling of reference data for distantly related populations, but further investigation is needed to confirm the results. For closely related populations the method does not perform better than pooling reference data. However, it does give an increased accuracy compared to analysis based on only one reference population, without an increased computational burden. The approach described here provides a general setup for inclusion of location specific priors: the approach could be used to include biological information in genomic predictions.
- Published
- 2012
34. Using the realized relationship matrix to disentangle confounding factors for the estimation of genetic variance components of complex traits.
- Author
-
Lee SH, Goddard ME, Visscher PM, and van der Werf JH
- Subjects
- Animals, Bias, Genome, Genotype, Mice, Models, Biological, Pedigree, Polymorphism, Single Nucleotide, Regression Analysis, Siblings, Genetic Variation genetics
- Abstract
Background: In the analysis of complex traits, genetic effects can be confounded with non-genetic effects, especially when using full-sib families. Dominance and epistatic effects are typically confounded with additive genetic and non-genetic effects. This confounding may cause the estimated genetic variance components to be inaccurate and biased. Methods: In this study, we constructed genetic covariance structures from whole-genome marker data, and thus used realized relationship matrices to estimate variance components in a heterogenous population of approximately 2200 mice for which four complex traits were investigated. These mice were genotyped for more than 10,000 single nucleotide polymorphisms (SNP) and the variances due to family, cage and genetic effects were estimated by models based on pedigree information only, aggregate SNP information, and model selection for specific SNP effects. Results and Conclusions: We show that the use of genome-wide SNP information can disentangle confounding factors to estimate genetic variances by separating genetic and non-genetic effects. The estimated variance components using realized relationship were more accurate and less biased, compared to those based on pedigree information only. Models that allow the selection of individual SNP in addition to fitting a relationship matrix are more efficient for traits with a significant dominance variance.
- Published
- 2010
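The abstract of entry 34 does not say how its realized relationship matrices were computed. As a minimal sketch only, the block below assumes one widely used construction: genotypes coded as 0/1/2 copies of an allele, centred by twice the observed allele frequency, and scaled by 2Σp(1−p). The function name, the toy genotypes, and the choice of this particular scaling are illustrative assumptions, not the authors' method.

```python
def realized_relationship_matrix(genotypes):
    """Build a marker-based (realized) relationship matrix.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2)
    per SNP. Assumed construction (not taken from the abstract):
    G = Z Z' / (2 * sum(p * (1 - p))), with Z the counts centred by 2p.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each SNP, estimated from the sample itself.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; matrix is undefined")
    # Centre each genotype by its expected value 2p.
    centred = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three individuals, three SNPs.
example = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
G = realized_relationship_matrix(example)
```

In a variance component analysis, a matrix like `G` would take the place of the pedigree-based relationship matrix, which is what lets marker data separate genetic from family and cage effects.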
35. Multi-locus models of genetic risk of disease.
- Author
-
Wray NR and Goddard ME
- Abstract
Background: Evidence for genetic contribution to complex diseases is described by recurrence risks to relatives of diseased individuals. Genome-wide association studies allow a description of the genetics of the same diseases in terms of risk loci, their effects and allele frequencies. To reconcile the two descriptions requires a model of how risks from individual loci combine to determine an individual's overall risk. Methods: We derive predictions of risk to relatives from risks at individual loci under a number of models and compare them with published data on disease risk. Results: The model in which risks are multiplicative on the risk scale implies equality between the recurrence risk to monozygotic twins and the square of the recurrence risk to sibs, a relationship often not observed, especially for low prevalence diseases. We show that this theoretical equality is achieved by allowing impossible probabilities of disease. Other models, in which probabilities of disease are constrained to a maximum of one, generate results more consistent with empirical estimates for a range of diseases. Conclusions: The unconstrained multiplicative model, often used in theoretical studies because of its mathematical tractability, is not a realistic model. We find three models, the constrained multiplicative, Odds (or Logit) and Probit (or liability threshold) models, all fit the data on risk to relatives. Currently, in practice it would be difficult to differentiate between these models, but this may become possible if genetic variants that explain the majority of the genetic variance are identified.
- Published
- 2010
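The relationship stated in the abstract of entry 35, that under the unconstrained multiplicative model the recurrence risk ratio for monozygotic twins equals the square of that for sibs, can be checked numerically. The sketch below is not the authors' derivation. It assumes purely additive variance at each locus, so that a locus with term x = V_A / K² (additive variance on the risk scale over squared prevalence) contributes a factor 1 + x to the twin ratio and 1 + x/2 to the sib ratio, and it assumes many loci of small effect. The function name and the toy values are hypothetical.

```python
import math


def recurrence_risk_ratios(locus_terms):
    """Return (lambda_MZ, lambda_sib) under a multiplicative multi-locus model.

    locus_terms: per-locus x_i = V_A,i / K^2 (assumed additive, no dominance).
    Ratios multiply across loci, which is the unconstrained multiplicative
    assumption discussed in the abstract.
    """
    lam_mz = math.prod(1 + x for x in locus_terms)
    lam_sib = math.prod(1 + x / 2 for x in locus_terms)
    return lam_mz, lam_sib


# Hypothetical example: 1000 loci, each with a very small contribution.
terms = [0.001] * 1000
lam_mz, lam_sib = recurrence_risk_ratios(terms)
relative_gap = abs(lam_mz - lam_sib ** 2) / lam_mz
```

With small per-locus terms, `relative_gap` is close to zero, so the twin ratio is approximately the sib ratio squared. Nothing in this calculation caps an individual's probability of disease at one, which is the weakness of the unconstrained model that the abstract points out.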
36. Genes influencing milk production traits predominantly affect one of four biological pathways.
- Author
-
Chamberlain AJ, McPartlan HC, and Goddard ME
- Subjects
- Animals, Dairying, Efficiency, Female, Genetic Linkage, Models, Genetic, Lactation genetics, Milk, Quantitative Trait Loci, Signal Transduction genetics
- Abstract
In this study we introduce a method that accounts for false positive and false negative results in attempting to estimate the true proportion of quantitative trait loci that affect two different traits. This method was applied to data from a genome scan that was used to detect QTL for three independent milk production traits, Australian Selection Index (ASI), protein percentage (P%) and fat percentage corrected for protein percentage (F% - P%). These four different scenarios are attributed to four biological pathways: QTL that (1) increase or decrease total mammary gland production (affecting ASI only); (2) increase or decrease lactose synthesis resulting in the volume of milk being changed but without a change in protein or fat yield (affecting P% only); (3) increase or decrease protein synthesis while milk volume remains relatively constant (affecting ASI and P% in the same direction); (4) increase or decrease fat synthesis while the volume of milk remains relatively constant (affecting F% - P% only). The results indicate that of the positions that detected a gene, most affected one trait and not the others, though a small proportion (2.8%) affected ASI and P% in the same direction.
- Published
- 2008
37. Empirical evaluation of selective DNA pooling to map QTL in dairy cattle using a half-sib design by comparison to individual genotyping and interval mapping.
- Author
-
Mariasegaram M, Robinson NA, and Goddard ME
- Subjects
- Alleles, Animals, Cattle, Chromosomes, DNA blood, Female, Gene Frequency, Genetic Linkage, Genetic Markers, Genome, Linear Models, Models, Statistical, Pedigree, Chromosome Mapping, DNA analysis, Dairying, Genotype, Inbreeding, Quantitative Trait, Heritable
- Abstract
This study represents the first attempt at an empirical evaluation of the DNA pooling methodology by comparing it to individual genotyping and interval mapping to detect QTL in a dairy half-sib design. The findings indicated that the use of peak heights from the pool electropherograms without correction for stutter (shadow) product and preferential amplification performed as well as corrected estimates of frequencies. However, errors were found to decrease the power of the experiment at every stage of the pooling and analysis. The main sources of errors include technical errors from DNA quantification, pool construction, inconsistent differential amplification, and from the prevalence of sire alleles in the dams. Additionally, interval mapping using individual genotyping gains information from phenotypic differences between individuals in the same pool and from neighbouring markers, which is lost in a DNA pooling design. These errors cause some differences between the markers detected as significant by pooling and those found significant by interval mapping based on individual selective genotyping. Therefore, it is recommended that pooled genotyping only be used as part of an initial screen with significant results to be confirmed by individual genotyping. Strategies for improving the efficiency of the DNA pooling design are also presented.
- Published
- 2007