24 results on '"Goddard, Michael"'
Search Results
2. In it for the long run: perspectives on exploiting long-read sequencing in livestock for population scale studies of structural variants
- Author
-
Nguyen, Tuan V., Vander Jagt, Christy J., Wang, Jianghui, Daetwyler, Hans D., Xiang, Ruidong, Goddard, Michael E., Nguyen, Loan T., Ross, Elizabeth M., Hayes, Ben J., Chamberlain, Amanda J., and MacLeod, Iona M.
- Published
- 2023
- Full Text
- View/download PDF
3. Sharing of either phenotypes or genetic variants can increase the accuracy of genomic prediction of feed efficiency
- Author
-
Bolormaa, Sunduimijid, MacLeod, Iona M., Khansefid, Majid, Marett, Leah C., Wales, William J., Miglior, Filippo, Baes, Christine F., Schenkel, Flavio S., Connor, Erin E., Manzanilla-Pech, Coralia I. V., Stothard, Paul, Herman, Emily, Nieuwhof, Gert J., Goddard, Michael E., and Pryce, Jennie E.
- Published
- 2022
- Full Text
- View/download PDF
4. Genetic variation in histone modifications and gene expression identifies regulatory variants in the mammary gland of cattle
- Author
-
Prowse-Wilkins, Claire P., Lopdell, Thomas J., Xiang, Ruidong, Vander Jagt, Christy J., Littlejohn, Mathew D., Chamberlain, Amanda J., and Goddard, Michael E.
- Published
- 2022
- Full Text
- View/download PDF
5. Allele-specific binding variants causing ChIP-seq peak height of histone modification are not enriched in expression QTL annotations.
- Author
-
Ghoreishifar, Mohammad, Chamberlain, Amanda J., Xiang, Ruidong, Prowse-Wilkins, Claire P., Lopdell, Thomas J., Littlejohn, Mathew D., Pryce, Jennie E., and Goddard, Michael E.
- Subjects
GENE expression ,LOCUS (Genetics) ,LINKAGE disequilibrium ,NUCLEOTIDE sequence ,IMMUNOPRECIPITATION ,SUPPORT vector machines - Abstract
Background: Genome sequence variants affecting complex traits (quantitative trait loci, QTL) are enriched in functional regions of the genome, such as those marked by certain histone modifications. These variants are believed to influence gene expression. However, due to the linkage disequilibrium among nearby variants, pinpointing the precise location of QTL is challenging. We aimed to identify allele-specific binding (ASB) QTL (asbQTL) that cause variation in the level of histone modification, as measured by the height of peaks assayed by ChIP-seq (chromatin immunoprecipitation sequencing). We identified DNA sequences that predict the difference between alleles in ChIP-seq peak height in H3K4me3 and H3K27ac histone modifications in the mammary glands of cows. Results: We used a gapped k-mer support vector machine, a novel best linear unbiased prediction model, and a multiple linear regression model that combines the other two approaches to predict variant impacts on peak height. For each method, a subset of 1000 sites with the highest magnitude of predicted ASB was considered as candidate asbQTL. The accuracy of this prediction was measured by the proportion where the predicted direction matched the observed direction. Prediction accuracy ranged between 0.59 and 0.74, suggesting that these 1000 sites are enriched for asbQTL. Using independent data, we investigated functional enrichment in the candidate asbQTL set and three control groups, including non-causal ASB sites, non-ASB variants under a peak, and SNPs (single nucleotide polymorphisms) not under a peak. For H3K4me3, a higher proportion of the candidate asbQTL were confirmed as ASB when compared to the non-causal ASB sites (P < 0.01). However, these candidate asbQTL did not enrich for the other annotations, including expression QTL (eQTL), allele-specific expression QTL (aseQTL) and sites conserved across mammals (P > 0.05). 
Conclusions: We identified putatively causal sites for asbQTL using the DNA sequence surrounding these sites. Our results suggest that many sites influencing histone modifications may not directly affect gene expression. However, it is important to acknowledge that distinguishing between putative causal ASB sites and other non-causal ASB sites in high linkage disequilibrium with the causal sites regarding their impact on gene expression may be challenging due to limitations in statistical power. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
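Note on item 5: the abstract measures prediction accuracy as the proportion of candidate sites where the predicted direction of allele-specific binding matches the observed direction, after keeping the sites with the largest predicted magnitude. The sketch below illustrates that calculation only. The function names and toy numbers are hypothetical and are not taken from the paper, and sites with a zero effect are skipped as an assumption.

```python
def top_sites_by_magnitude(predicted, n):
    """Return indices of the n sites with the largest absolute predicted effect."""
    order = sorted(range(len(predicted)), key=lambda i: abs(predicted[i]), reverse=True)
    return order[:n]


def direction_match_accuracy(predicted, observed):
    """Proportion of sites whose predicted and observed effects share a sign.

    Sites where either value is exactly zero are skipped (an assumption made
    here for illustration; the abstract does not say how ties are handled).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = [(p, o) for p, o in zip(predicted, observed) if p != 0 and o != 0]
    if not pairs:
        raise ValueError("no sites with non-zero predicted and observed effects")
    matches = sum(1 for p, o in pairs if (p > 0) == (o > 0))
    return matches / len(pairs)


# Toy data: predicted and observed allelic differences in peak height.
predicted = [0.9, -0.7, 0.4, -0.2, 0.1]
observed = [0.5, -0.3, -0.2, -0.1, 0.3]

top = top_sites_by_magnitude(predicted, 3)
accuracy = direction_match_accuracy(
    [predicted[i] for i in top], [observed[i] for i in top]
)
```

An accuracy of 0.5 is what random guessing of the direction would give, which is why the reported range of 0.59 to 0.74 is read as enrichment.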
6. A common regulatory haplotype doubles lactoferrin concentration in milk.
- Author
-
Lopdell, Thomas J., Trevarton, Alexander J., Moody, Janelle, Prowse-Wilkins, Claire, Knowles, Sarah, Tiplady, Kathryn, Chamberlain, Amanda J., Goddard, Michael E., Spelman, Richard J., Lehnert, Klaus, Snell, Russell G., Davis, Stephen R., and Littlejohn, Mathew D.
- Subjects
LACTOFERRIN ,HAPLOTYPES ,LOCUS (Genetics) ,WHEY proteins ,GENE mapping ,MILK - Abstract
Background: Bovine lactoferrin (Lf) is an iron absorbing whey protein with antibacterial, antiviral, and antifungal activity. Lactoferrin is economically valuable and has an extremely variable concentration in milk, partly driven by environmental influences such as milking frequency, involution, or mastitis. A significant genetic influence has also been previously observed to regulate lactoferrin content in milk. Here, we conducted genetic mapping of lactoferrin protein concentration in conjunction with RNA-seq, ChIP-seq, and ATAC-seq data to pinpoint candidate causative variants that regulate lactoferrin concentrations in milk. Results: We identified a highly significant lactoferrin protein quantitative trait locus (pQTL), as well as a cis lactotransferrin (LTF) expression QTL (cis-eQTL) mapping to the LTF locus. Using ChIP-seq and ATAC-seq datasets representing lactating mammary tissue samples, we also report a number of regions where the openness of chromatin is under genetic influence. Several of these also show highly significant QTL with genetic signatures similar to those highlighted through pQTL and eQTL analysis. By performing correlation analysis between these QTL, we revealed an ATAC-seq peak in the putative promoter region of LTF that highlights a set of 115 high-frequency variants that are potentially responsible for these effects. One of the 115 variants (rs110000337), which maps within the ATAC-seq peak, was predicted to alter binding sites of transcription factors known to be involved in lactation-related pathways. Conclusions: Here, we report a regulatory haplotype of 115 variants with conspicuously large impacts on milk lactoferrin concentration. These findings could enable the selection of animals for high-producing specialist herds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Comparing allele specific expression and local expression quantitative trait loci and the influence of gene expression on complex trait variation in cattle
- Author
-
Khansefid, Majid, Pryce, Jennie E., Bolormaa, Sunduimijid, Chen, Yizhou, Millen, Catriona A., Chamberlain, Amanda J., Vander Jagt, Christy J., and Goddard, Michael E.
- Published
- 2018
- Full Text
- View/download PDF
8. Genome variants associated with RNA splicing variations in bovine are extensively shared between tissues
- Author
-
Xiang, Ruidong, Hayes, Ben J., Vander Jagt, Christy J., MacLeod, Iona M., Khansefid, Majid, Bowman, Phil J., Yuan, Zehu, Prowse-Wilkins, Claire P., Reich, Coralie M., Mason, Brett A., Garner, Josie B., Marett, Leah C., Chen, Yizhou, Bolormaa, Sunduimijid, Daetwyler, Hans D., Chamberlain, Amanda J., and Goddard, Michael E.
- Published
- 2018
- Full Text
- View/download PDF
9. A multi-trait Bayesian method for mapping QTL and genomic prediction
- Author
-
Kemper, Kathryn E., Bowman, Philip J., Hayes, Benjamin J., Visscher, Peter M., and Goddard, Michael E.
- Published
- 2018
- Full Text
- View/download PDF
10. Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution.
- Author
-
Pausch, Hubert, Emmerling, Reiner, Gredler-Grandl, Birgit, Fries, Ruedi, Daetwyler, Hans D., and Goddard, Michael E.
- Subjects
NUCLEOTIDES ,NUCLEIC acids ,DATA analysis ,CATTLE breeds ,LIVESTOCK breeds - Abstract
Background: Genotyping and whole-genome sequencing data have been generated for hundreds of thousands of cattle. International consortia used these data to compile imputation reference panels that facilitate the imputation of sequence variant genotypes for animals that have been genotyped using dense microarrays. Association studies with imputed sequence variant genotypes allow for the characterization of quantitative trait loci (QTL) at nucleotide resolution particularly when individuals from several breeds are included in the mapping populations. Results: We imputed genotypes for 28 million sequence variants in 17,229 cattle of the Braunvieh, Fleckvieh and Holstein breeds in order to compile large mapping populations that provide high power to identify QTL for milk production traits. Association tests between imputed sequence variant genotypes and fat and protein percentages in milk uncovered between six and thirteen QTL (P < 1e-8) per breed. Eight of the detected QTL were significant in more than one breed. We combined the results across breeds using meta-analysis and identified a total of 25 QTL including six that were not significant in the within-breed association studies. Two missense mutations in the ABCG2 (p.Y581S, rs43702337, P = 4.3e-34) and GHR (p.F279Y, rs385640152, P = 1.6e-74) genes were the top variants at QTL on chromosomes 6 and 20. Another known causal missense mutation in the DGAT1 gene (p.A232K, rs109326954, P = 8.4e-1436) was the second top variant at a QTL on chromosome 14 but its allelic substitution effects were inconsistent across breeds. It turned out that the conflicting allelic substitution effects resulted from flaws in the imputed genotypes due to the use of a multi-breed reference population for genotype imputation. Conclusions: Many QTL for milk production traits segregate across breeds and across-breed meta-analysis has greater power to detect such QTL than within-breed association testing. 
Association testing between imputed sequence variant genotypes and phenotypes of interest facilitates identifying causal mutations provided the accuracy of imputation is high. However, true causal mutations may remain undetected when the imputed sequence variant genotypes contain flaws. It is highly recommended to validate the effect of known causal variants in order to assess the ability to detect true causal mutations in association studies with imputed sequence variants. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
11. Application of a Bayesian non-linear model hybrid scheme to sequence data for genomic prediction and QTL mapping.
- Author
-
Wang, Tingting, Chen, Yi-Ping Phoebe, MacLeod, Iona M., Pryce, Jennie E., Goddard, Michael E., and Hayes, Ben J.
- Subjects
NUCLEOTIDE sequencing ,EXPECTATION-maximization algorithms ,MARKOV chain Monte Carlo ,CATTLE population genetics ,MILK yield ,CATTLE fertility - Abstract
Background: Using whole genome sequence data might improve genomic prediction accuracy, when compared with high-density SNP arrays, and could lead to identification of causal mutations affecting complex traits. For some traits, the most accurate genomic predictions are achieved with non-linear Bayesian methods. However, as the number of variants and the size of the reference population increase, the computational time required to implement these Bayesian methods (typically with Monte Carlo Markov Chain sampling) becomes unfeasibly long. Results: Here, we applied a new method, HyB_BR (for Hybrid BayesR), which implements a mixture model of normal distributions and hybridizes an Expectation-Maximization (EM) algorithm followed by Markov Chain Monte Carlo (MCMC) sampling, to genomic prediction in a large dairy cattle population with imputed whole genome sequence data. The imputed whole genome sequence data included 994,019 variant genotypes of 16,214 Holstein and Jersey bulls and cows. Traits included fat yield, milk volume, protein kg, fat% and protein% in milk, as well as fertility and heat tolerance. HyB_BR achieved genomic prediction accuracies as high as the full MCMC implementation of BayesR, both for predicting a validation set of Holstein and Jersey bulls (multi-breed prediction) and a validation set of Australian Red bulls (across-breed prediction). HyB_BR had a ten-fold reduction in compute time, compared with the MCMC implementation of BayesR (48 hours versus 594 hours). We also demonstrate that in many cases HyB_BR identified sequence variants with a high posterior probability of affecting the milk production or fertility traits that were similar to those identified in BayesR. For heat tolerance, both HyB_BR and BayesR found variants in or close to promising candidate genes associated with this trait and not detected by previous studies. 
Conclusions: The results demonstrate that HyB_BR is a feasible method for simultaneous genomic prediction and QTL mapping with whole genome sequence in large reference populations. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
12. Multiple-trait QTL mapping and genomic prediction for wool traits in sheep.
- Author
-
Bolormaa, Sunduimijid, Swan, Andrew A., Brown, Daniel J., Hatcher, Sue, Moghaddar, Nasir, van der Werf, Julius H., Goddard, Michael E., and Daetwyler, Hans D.
- Subjects
SHEEP genetics ,GENOMES ,SHEEP breeding - Abstract
Background: The application of genomic selection to sheep breeding could lead to substantial increases in profitability of wool production due to the availability of accurate breeding values from single nucleotide polymorphism (SNP) data. Several key traits determine the value of wool and influence a sheep's susceptibility to fleece rot and fly strike. Our aim was to predict genomic estimated breeding values (GEBV) and to compare three methods of combining information across traits to map polymorphisms that affect these traits. Methods: GEBV for 5726 Merino and Merino crossbred sheep were calculated using BayesR and genomic best linear unbiased prediction (GBLUP) with real and imputed 510,174 SNPs for 22 traits (at yearling and adult ages) including wool production and quality, and breech conformation traits that are associated with susceptibility to fly strike. Accuracies of these GEBV were assessed using fivefold cross-validation. We also devised and compared three approximate multi-trait analyses to map pleiotropic quantitative trait loci (QTL): a multi-trait genome-wide association study and two multi-trait methods that use the output from BayesR analyses. One BayesR method used local GEBV for each trait, while the other used the posterior probabilities that a SNP had an effect on each trait. Results: BayesR and GBLUP resulted in similar average GEBV accuracies across traits (~0.22). BayesR accuracies were highest for wool yield and fibre diameter (>0.40) and lowest for skin quality and dag score (<0.10). Generally, accuracy was higher for traits with larger reference populations and higher heritability. In total, the three multi-trait analyses identified 206 putative QTL, of which 20 were common to the three analyses. The two BayesR multi-trait approaches mapped QTL in a more defined manner than the multi-trait GWAS. We identified genes with known effects on hair growth (i.e. FGF5, STAT3, KRT86, and ALX4) near SNPs with pleiotropic effects on wool traits. 
Conclusions: The mean accuracy of genomic prediction across wool traits was around 0.22. The three multi-trait analyses identified 206 putative QTL across the ovine genome. Detailed phenotypic information helped to identify likely candidate genes. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
13. Evaluation of the accuracy of imputed sequence variant genotypes and their utility for causal variant detection in cattle.
- Author
-
Pausch, Hubert, MacLeod, Iona M., Fries, Ruedi, Emmerling, Reiner, Bowman, Phil J., Daetwyler, Hans D., and Goddard, Michael E.
- Subjects
CATTLE locomotion ,ANIMAL locomotion ,GENOTYPES ,NUCLEOTIDE sequencing ,HEMOGLOBIN polymorphisms - Abstract
Background: The availability of dense genotypes and whole-genome sequence variants from various sources offers the opportunity to compile large datasets consisting of tens of thousands of individuals with genotypes at millions of polymorphic sites that may enhance the power of genomic analyses. The imputation of missing genotypes ensures that all individuals have genotypes for a shared set of variants. Results: We evaluated the accuracy of imputation from dense genotypes to whole-genome sequence variants in 249 Fleckvieh and 450 Holstein cattle using Minimac and FImpute. The sequence variants of a subset of the animals were reduced to the variants that were included on the Illumina BovineHD genotyping array and subsequently inferred in silico using either within- or multi-breed reference populations. The accuracy of imputation varied considerably across chromosomes and dropped at regions where the bovine genome contains segmental duplications. Depending on the imputation strategy, the correlation between imputed and true genotypes ranged from 0.898 to 0.952. The accuracy of imputation was higher with Minimac than FImpute particularly for variants with a low minor allele frequency. Using a multi-breed reference population increased the accuracy of imputation, particularly when FImpute was used to infer genotypes. When the sequence variants were imputed using Minimac, the true genotypes were more correlated to predicted allele dosages than best-guess genotypes. The computing costs to impute 23,256,743 sequence variants in 6958 animals were ten-fold higher with Minimac than FImpute. Association studies with imputed sequence variants revealed seven quantitative trait loci (QTL) for milk fat percentage. Two causal mutations in the DGAT1 and GHR genes were the most significantly associated variants at two QTL on chromosomes 14 and 20 when Minimac was used to infer genotypes. 
Conclusions: The population-based imputation of millions of sequence variants in large cohorts is computationally feasible and provides accurate genotypes. However, the accuracy of imputation is low in regions where the genome contains large segmental duplications or the coverage with array-derived single nucleotide polymorphisms is poor. Using a reference population that includes individuals from many breeds increases the accuracy of imputation particularly at low-frequency variants. Considering allele dosages rather than best-guess genotypes as explanatory variables is advantageous to detect causal mutations in association studies with imputed sequence variants. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
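Note on item 13: the abstract reports imputation accuracy as the correlation between imputed and true genotypes, and finds that predicted allele dosages track the true genotypes more closely than best-guess genotypes. The sketch below shows how the two quantities differ for one variant. The genotype probabilities are invented toy values, and the helper names are hypothetical; this is not the output format of Minimac or FImpute.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dosage(probs):
    """Expected alternate-allele count from (P(0), P(1), P(2)) copies."""
    return probs[1] + 2.0 * probs[2]


def best_guess(probs):
    """Most probable genotype (0, 1 or 2 copies of the alternate allele)."""
    return max(range(3), key=lambda g: probs[g])


# Toy data for one variant in six animals (hypothetical values).
true_genotypes = [0, 1, 2, 1, 0, 2]
genotype_probs = [
    (0.90, 0.10, 0.00),
    (0.30, 0.60, 0.10),
    (0.00, 0.20, 0.80),
    (0.50, 0.45, 0.05),
    (0.70, 0.30, 0.00),
    (0.05, 0.50, 0.45),
]

dosages = [dosage(p) for p in genotype_probs]
best_guesses = [best_guess(p) for p in genotype_probs]

r_dosage = pearson(true_genotypes, dosages)
r_best_guess = pearson(true_genotypes, best_guesses)
```

Dosages keep the uncertainty in each call, whereas best-guess genotypes discard it by rounding to the most likely class, which is one way to read the abstract's recommendation to use dosages as explanatory variables.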
14. A hybrid expectation maximisation and MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL mapping.
- Author
-
Wang, Tingting, Chen, Yi-Ping Phoebe, Bowman, Phil J., Goddard, Michael E., and Hayes, Ben J.
- Subjects
GENOMICS ,GENE mapping ,MARKOV chain Monte Carlo ,PREDICTION models ,EXPECTATION-maximization algorithms - Abstract
Background: Bayesian mixture models in which the effects of SNP are assumed to come from normal distributions with different variances are attractive for simultaneous genomic prediction and QTL mapping. These models are usually implemented with Monte Carlo Markov Chain (MCMC) sampling, which requires long compute times with large genomic data sets. Here, we present an efficient approach (termed HyB_BR), which is a hybrid of an Expectation-Maximisation algorithm, followed by a limited number of MCMC iterations without the requirement for burn-in. Results: To test prediction accuracy from HyB_BR, dairy cattle and human disease trait data were used. In the dairy cattle data, there were four quantitative traits (milk volume, protein kg, fat% in milk and fertility) measured in 16,214 cattle from two breeds genotyped for 632,002 SNPs. Validation of genomic predictions was in a subset of cattle either from the reference set or in animals from a third breed that were not in the reference set. In all cases, HyB_BR gave almost identical accuracies to Bayesian mixture models implemented with full MCMC; however, computational time was reduced to as little as 1/17 of that required by full MCMC. The SNPs with high posterior probability of a non-zero effect were also very similar between full MCMC and HyB_BR, with several known genes affecting milk production in this category, as well as some novel genes. HyB_BR was also applied to seven human diseases with 4890 individuals genotyped for around 300 K SNPs in a case/control design, from the Wellcome Trust Case Control Consortium (WTCCC). In this data set, the results demonstrated again that HyB_BR performed as well as Bayesian mixture models with full MCMC for genomic predictions and genetic architecture inference while reducing the computational time from 45 h with full MCMC to 3 h with HyB_BR. 
Conclusions: The results for quantitative traits in cattle and disease in humans demonstrate that HyB_BR can perform as well as Bayesian mixture models implemented with full MCMC in terms of prediction accuracy, while being up to 17 times faster than the full MCMC implementations. The HyB_BR algorithm makes simultaneous genomic prediction, QTL mapping and inference of genetic architecture feasible in large genomic data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
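Note on item 14: the abstract describes a prior in which each SNP effect comes from one of several zero-mean normal distributions with different variances. The sketch below draws effects from such a mixture to show what that assumption looks like. It is not the HyB_BR algorithm, which estimates effects from data. The four variance classes (0, 0.0001, 0.001 and 0.01 times the genetic variance) and the mixing proportions are assumptions chosen for illustration, since the abstract does not state them.

```python
import random


def sample_snp_effects(n_snps, genetic_variance, proportions, variance_scales, seed=0):
    """Draw SNP effects from a mixture of zero-mean normal distributions.

    proportions[k] is the prior probability of class k, and
    variance_scales[k] * genetic_variance is that class's variance.
    A class with zero variance yields an effect of exactly zero.
    """
    if len(proportions) != len(variance_scales):
        raise ValueError("proportions and variance_scales must align")
    rng = random.Random(seed)
    effects, classes = [], []
    for _ in range(n_snps):
        k = rng.choices(range(len(proportions)), weights=proportions)[0]
        sd = (variance_scales[k] * genetic_variance) ** 0.5
        effects.append(rng.gauss(0.0, sd) if sd > 0 else 0.0)
        classes.append(k)
    return effects, classes


# Illustrative settings (assumptions, not values from the abstract).
variance_scales = [0.0, 1e-4, 1e-3, 1e-2]
proportions = [0.95, 0.03, 0.015, 0.005]

effects, classes = sample_snp_effects(
    n_snps=1000,
    genetic_variance=1.0,
    proportions=proportions,
    variance_scales=variance_scales,
)
n_nonzero = sum(1 for e in effects if e != 0.0)
```

Under a prior of this shape most SNPs have no effect and a few have comparatively large ones, which is why the posterior probability of a non-zero effect can double as a QTL mapping signal.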
15. Copy number variants in the sheep genome detected using multiple approaches.
- Author
-
Jenkins, Gemma M., Goddard, Michael E., Black, Michael A., Brauning, Rudiger, Auvray, Benoit, Dodds, Ken G., Kijas, James W., Cockett, Noelle, and McEwan, John C.
- Subjects
DNA copy number variations ,SHEEP genetics ,POLYMORPHISM (Zoology) ,NUCLEOTIDE sequencing ,GENOMICS - Abstract
Background: Copy number variants (CNVs) are a type of polymorphism found to underlie phenotypic variation, both in humans and livestock. Most surveys of CNV in livestock have been conducted in the cattle genome, and often utilise only a single approach for the detection of copy number differences. Here we performed a study of CNV in sheep, using multiple methods to identify and characterise copy number changes. Comprehensive information from small pedigrees (trios) was collected using multiple platforms (array CGH, SNP chip and whole genome sequence data), with these data then analysed via multiple approaches to identify and verify CNVs. Results: In total, 3,488 autosomal CNV regions (CNVRs) were identified in this study, which substantially builds on an initial survey of the sheep genome that identified 135 CNVRs. The average length of the identified CNVRs was 19 kb (range of 1 kb to 3.6 Mb), with shorter CNVRs being more frequent than longer CNVRs. The total length of all CNVRs was 67.6 Mb, which equates to 2.7 % of the sheep autosomes. For individuals this value ranged from 0.24 to 0.55 %, and the majority of CNVRs were identified in single animals. Rather than being uniformly distributed throughout the genome, CNVRs tended to be clustered. Application of three independent approaches for CNVR detection facilitated a comparison of validation rates. CNVs identified on the Roche-NimbleGen 2.1M CGH array generally had low validation rates with lower density arrays, while whole genome sequence data had the highest validation rate (>60 %). Conclusions: This study represents the first comprehensive survey of the distribution, prevalence and characteristics of CNVR in sheep. Multiple approaches were used to detect CNV regions and it appears that the best method for verifying CNVR on a large scale involves using a combination of detection methodologies. 
The characteristics of the 3,488 autosomal CNV regions identified in this study are comparable to other CNV regions reported in the literature and provide a valuable and sizeable addition to the small subset of published sheep CNVs. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
16. Detailed phenotyping identifies genes with pleiotropic effects on body composition.
- Author
-
Bolormaa, Sunduimijid, Hayes, Ben J., van der Werf, Julius H. J., Pethick, David, Goddard, Michael E., and Daetwyler, Hans D.
- Subjects
BODY composition ,HUMAN genetic variation ,GENETIC pleiotropy ,GLYCOGEN synthases ,PHENOTYPES - Abstract
Background: Genetic variation in both the composition and distribution of fat and muscle in the body is important to human health as well as the healthiness and value of meat from cattle and sheep. Here we use detailed phenotyping and a multi-trait approach to identify genes explaining variation in body composition traits. Results: A multi-trait genome wide association analysis of 56 carcass composition traits measured on 10,613 sheep with imputed and real genotypes on 510,174 SNPs was performed. We clustered 71 significant SNPs into five groups based on their pleiotropic effects across the 56 traits. Among these 71 significant SNPs, one group of 11 SNPs affected the fatty acid profile of the muscle and were close to 8 genes involved in fatty acid or triglyceride synthesis. Another group of 23 SNPs had an effect on mature size, based on their pattern of effects across traits, but the genes near this group of SNPs did not share any obvious function. Many of the likely candidate genes near SNPs with significant pleiotropic effects on the 56 traits are involved in intra-cellular signalling pathways. Among the significant SNPs were some with a convincing candidate gene due to the function of the gene (e.g. glycogen synthase affecting glycogen concentration) or because the same gene was associated with similar traits in other species. Conclusions: Using a multi-trait analysis increased the power to detect associations between SNP and body composition traits compared with the single trait analyses. Detailed phenotypic information helped to identify a convincing candidate in some cases as did information from other species. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
17. Extensive variation between tissues in allele specific expression in an outbred mammal.
- Author
-
Chamberlain, Amanda J., Vander Jagt, Christy J., Hayes, Benjamin J., Khansefid, Majid, Marett, Leah C., Millen, Catriona A., Nguyen, Thuy T. T., and Goddard, Michael E.
- Subjects
GENE expression in mammals ,ALLELES ,CHROMOSOMES ,RNA sequencing ,NUCLEOTIDE sequencing ,GENES - Abstract
Background: Allele specific gene expression (ASE), with the paternal allele more expressed than the maternal allele or vice versa, appears to be a common phenomenon in humans and mice. In other species the extent of ASE is unknown, and even in humans and mice there are several outstanding questions. These include: to what extent is ASE tissue specific; how often does the direction of allele expression imbalance reverse between tissues; how often is only one of the two alleles expressed; is there a genome wide bias towards expression of the paternal or maternal allele; and, finally, do genes that are nearby on a chromosome share the same direction of ASE? Here we use gene expression data (RNASeq) from 18 tissues from a single cow to investigate each of these questions in turn, and then validate some of these findings in two tissues from 20 cows. Results: Between 40 and 100 million sequence reads were generated per tissue across three replicate samples for each of the eighteen tissues from the single cow (the discovery dataset). A bovine gene expression atlas was created (the first from RNASeq data), and differentially expressed genes in each tissue were identified. To analyse ASE, we had access to unambiguously phased genotypes for all heterozygous variants in the cow's whole genome sequence, where these variants were homozygous in the whole genome sequence of her sire, and as a result we were able to map reads to parental genomes, to determine SNPs and genes showing ASE in each tissue. In total 25,251 heterozygous SNP within 7985 genes were tested for ASE in at least one tissue. ASE was pervasive: 89 % of genes tested had significant ASE in at least one tissue. This large proportion of genes displaying ASE was confirmed in the two tissues in a validation dataset. For individual tissues the proportion of genes showing significant ASE varied from as low as 8-16 % of those tested in thymus to as high as 71-82 % of those tested in lung. 
There were a number of cases where the direction of allele expression imbalance reversed between tissues. For example the gene SPTY2D1 showed almost complete paternal allele expression in kidney and thymus, and almost complete maternal allele expression in the brain caudal lobe and brain cerebellum. Mono allelic expression (MAE) was common, with 1349 of 4856 genes (28 %) tested with more than one heterozygous SNP showing MAE. Across all tissues, 54.17 % of all genes with ASE favoured the paternal allele. Genes that are closely linked on the chromosome were more likely to show higher expression of the same allele (paternal or maternal) than expected by chance. We identified several long runs of neighbouring genes that showed either paternal or maternal ASE, one example was five adjacent genes (GIMAP8, GIMAP7 copy1, GIMAP4, GIMAP7 copy 2 and GIMAP5) that showed almost exclusive paternal expression in brain caudal lobe. Conclusions: Investigating the extent of ASE across 18 bovine tissues in one cow and two tissues in 20 cows demonstrated 1) ASE is pervasive in cattle, 2) the ASE is often MAE but ranges from MAE to slight overexpression of the major allele, 3) the ASE is most often tissue specific and that more than half the time displays divergent allele specific expression patterns across tissues, 4) across all genes there is a slight bias towards expression of the paternal allele and 5) genes expressing the same parental allele are clustered together more than expected by chance, and there are several runs of large numbers of genes expressing the same parental allele. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
18. Impact of QTL properties on the accuracy of multi-breed genomic prediction.
- Author
-
Wientjes, Yvonne C. J., Calus, Mario P. L., Goddard, Michael E., and Hayes, Ben J.
- Subjects
GENOMES ,GENETIC algorithms ,FORECASTING ,GENETICS ,ANIMAL genetics - Abstract
Background: Although simulation studies show that combining multiple breeds in one reference population increases accuracy of genomic prediction, this is not always confirmed in empirical studies. This discrepancy might be due to the assumptions on quantitative trait loci (QTL) properties applied in simulation studies, including number of QTL, spectrum of QTL allele frequencies across breeds, and distribution of allele substitution effects. We investigated the effects of QTL properties and of including a random across- and within-breed animal effect in a genomic best linear unbiased prediction (GBLUP) model on accuracy of multi-breed genomic prediction using genotypes of Holstein-Friesian and Jersey cows. Methods: Genotypes of three classes of variants obtained from whole-genome sequence data, with moderately low, very low or extremely low average minor allele frequencies (MAF), were imputed in 3000 Holstein-Friesian and 3000 Jersey cows that had real high-density genotypes. Phenotypes of traits controlled by QTL with different properties were simulated by sampling 100 or 1000 QTL from one class of variants and their allele substitution effects either randomly from a gamma distribution, or computed such that each QTL explained the same variance, i.e. rare alleles had a large effect. Genomic breeding values for 1000 selection candidates per breed were estimated using GBLUP models including a random across- and a within-breed animal effect. Results: For all three classes of QTL allele frequency spectra, accuracies of genomic prediction were not affected by the addition of 2000 individuals of the other breed to a reference population of the same breed as the selection candidates. Accuracies of both single- and multi-breed genomic prediction decreased as MAF of QTL decreased, especially when rare alleles had a large effect. 
Accuracies of genomic prediction were similar for the models with and without a random within-breed animal effect, probably because of insufficient power to separate across- and within-breed animal effects. Conclusions: Accuracy of both single- and multi-breed genomic prediction depends on the properties of the QTL that underlie the trait. As QTL MAF decreased, accuracy decreased, especially when rare alleles had a large effect. This demonstrates that QTL properties are key parameters that determine the accuracy of genomic prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2015
19. A computationally efficient algorithm for genomic prediction using a Bayesian model.
- Author
Tingting Wang, Yi-Ping Phoebe Chen, Goddard, Michael E., Meuwissen, Theo H. E., Kemper, Kathryn E., and Hayes, Ben J.
- Subjects
GENETICS, GENOTYPES, MARKOV processes, DAIRY cattle, BLOOD proteins, INBREEDING, CATTLE
- Abstract
Background: Genomic prediction of breeding values from dense single nucleotide polymorphism (SNP) genotypes is used for livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects from a heavy-tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes to sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation-maximisation algorithm (emBayesR) that gives similar estimates of SNP effects and accuracies of genomic prediction to the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time. Methods: emBayesR is an approximate EM algorithm that retains the BayesR model assumption with SNP effects sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data, and real dairy cattle data with 632 003 SNPs genotyped, to determine if the MCMC and the expectation-maximisation approaches give similar accuracies of genomic prediction. Results: We were able to demonstrate that allowing for the error associated with estimation of other SNP effects when estimating the effect of each SNP in emBayesR improved the accuracy of genomic prediction over emBayesR without including this error correction, with both simulated and real data. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. 
However, emBayesR reduced computing time up to 8-fold compared to BayesR. Conclusions: The emBayesR algorithm described here achieved similar accuracies of genomic prediction to BayesR for a range of simulated and real 630 K dairy SNP data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2015
20. Non-additive genetic variation in growth, carcass and fertility traits of beef cattle.
- Author
Bolormaa, Sunduimijid, Pryce, Jennie E., Yuandan Zhang, Reverter, Antonio, Barendse, William, Hayes, Ben J., and Goddard, Michael E.
- Subjects
CATTLE genetics, BEEF cattle, GENE expression, SINGLE nucleotide polymorphisms, EPISTASIS (Genetics), CATTLE
- Abstract
Background: A better understanding of non-additive variance could lead to increased knowledge on the genetic control and physiology of quantitative traits, and to improved prediction of the genetic value and phenotype of individuals. Genome-wide panels of single nucleotide polymorphisms (SNPs) have been mainly used to map additive effects for quantitative traits, but they can also be used to investigate non-additive effects. We estimated dominance and epistatic effects of SNPs on various traits in beef cattle and the variance explained by dominance, and quantified the increase in accuracy of phenotype prediction by including dominance deviations in its estimation. Methods: Genotype data (729 068 real or imputed SNPs) and phenotypes on up to 16 traits of 10 191 individuals from Bos taurus, Bos indicus and composite breeds were used. A genome-wide association study was performed by fitting the additive and dominance effects of single SNPs. The dominance variance was estimated by fitting a dominance relationship matrix constructed from the 729 068 SNPs. The accuracy of predicted phenotypic values was evaluated by best linear unbiased prediction using the additive and dominance relationship matrices. Epistatic interactions (additive × additive) were tested between each of the 28 SNPs that are known to have additive effects on multiple traits, and each of the other remaining 729 067 SNPs. Results: The number of significant dominance effects was greater than expected by chance and most of them were in the direction that is presumed to increase fitness and in the opposite direction to inbreeding depression. Estimates of dominance variance explained by SNPs varied widely between traits, but had large standard errors. The median dominance variance across the 16 traits was equal to 5% of the phenotypic variance. Including a dominance deviation in the prediction did not significantly increase its accuracy for any of the phenotypes. 
The number of additive × additive epistatic effects that were statistically significant was greater than expected by chance. Conclusions: Significant dominance and epistatic effects occur for growth, carcass and fertility traits in beef cattle but they are difficult to estimate precisely and including them in phenotype prediction does not increase its accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2015
21. Improved precision of QTL mapping using a nonlinear Bayesian method in a multi-breed population leads to greater accuracy of across-breed genomic predictions.
- Author
Kemper, Kathryn E., Reich, Coralie M., Bowman, Philip J., Vander Jagt, Christy J., Chamberlain, Amanda J., Mason, Brett A., Hayes, Benjamin J., and Goddard, Michael E.
- Subjects
DAIRY cattle genetics, BAYESIAN analysis, GENOMICS, SINGLE nucleotide polymorphisms, DAIRY cattle breeding, ANIMAL population density, GENE expression, CATTLE
- Abstract
Background: Genomic selection is increasingly widely practised, particularly in dairy cattle. However, the accuracy of current predictions using GBLUP (genomic best linear unbiased prediction) decays rapidly across generations, and also as selection candidates become less related to the reference population. This is likely caused by the effects of causative mutations being dispersed across many SNPs (single nucleotide polymorphisms) that span large genomic intervals. In this paper, we hypothesise that the use of a nonlinear method (BayesR), combined with a multi-breed (Holstein/Jersey) reference population will map causative mutations with more precision than GBLUP and this, in turn, will increase the accuracy of genomic predictions for selection candidates that are less related to the reference animals. Results: BayesR improved the across-breed prediction accuracy for Australian Red dairy cattle for five milk yield and composition traits by an average of 7% over the GBLUP approach (Australian Red animals were not included in the reference population). Using the multi-breed reference population with BayesR improved accuracy of prediction in Australian Red cattle by 2 - 5% compared to using BayesR with a single breed reference population. Inclusion of 8478 Holstein and 3917 Jersey cows in the reference population improved accuracy of predictions for these breeds by 4 and 5%. However, predictions for Holstein and Jersey cattle were similar using within-breed and multi-breed reference populations. We propose that the improvement in across-breed prediction achieved by BayesR with the multi-breed reference population is due to more precise mapping of quantitative trait loci (QTL), which was demonstrated for several regions. New candidate genes with functional links to milk synthesis were identified using differential gene expression in the mammary gland. 
Conclusions: QTL detection and genomic prediction are usually considered independently but persistence of genomic prediction accuracies across breeds requires accurate estimation of QTL effects. We show that accuracy of across-breed genomic predictions was higher with BayesR than with GBLUP and that BayesR mapped QTL more precisely. Further improvements of across-breed accuracy of genomic predictions and QTL mapping could be achieved by increasing the size of the reference population, including more breeds, and possibly by exploiting pleiotropic effects to improve mapping efficiency for QTL with small effects. [ABSTRACT FROM AUTHOR]
- Published
- 2015
22. Identification of genomic regions associated with inbreeding depression in Holstein and Jersey dairy cattle.
- Author
Pryce, Jennie E., Haile-Mariam, Mekonnen, Goddard, Michael E., and Hayes, Ben J.
- Subjects
DAIRY cattle breeding research, HOLSTEIN-Friesian cattle, JERSEY cattle, INBREEDING, SINGLE nucleotide polymorphisms, ALLELES, HOMOZYGOSITY
- Abstract
Background: Inbreeding reduces the fitness of individuals by increasing the frequency of homozygous deleterious recessive alleles. Some insight into the genetic architecture of fitness, and other complex traits, can be gained by using single nucleotide polymorphism (SNP) data to identify regions of the genome which lead to reduction in performance when identical by descent (IBD). Here, we compared the effect of genome-wide and location-specific homozygosity on fertility and milk production traits in dairy cattle. Methods: Genotype data from more than 43 000 SNPs were available for 8853 Holstein and 4138 Jersey dairy cows that were part of a much larger dataset that had pedigree records (338 696 Holstein and 64 049 Jersey animals). Measures of inbreeding were based on: (1) pedigree data; (2) genotypes to determine the realised proportion of the genome that is IBD; (3) the proportion of the total genome that is homozygous and (4) runs of homozygosity (ROH) which are stretches of the genome that are homozygous. Results: A 1% increase in inbreeding based either on pedigree or genomic data was associated with a decrease in milk, fat and protein yields of around 0.4 to 0.6% of the phenotypic mean, and an increase in calving interval (i.e. a deterioration in fertility) of 0.02 to 0.05% of the phenotypic mean. A genome-wide association study using ROH of more than 50 SNPs revealed genomic regions that resulted in depression of up to 12.5 d and 260 L for calving interval and milk yield, respectively, when completely homozygous. Conclusions: Genomic measures can be used instead of pedigree-based inbreeding to estimate inbreeding depression. Both the diagonal elements of the genomic relationship matrix and the proportion of homozygous SNPs can be used to measure inbreeding. Longer ROH (>3 Mb) were found to be associated with a reduction in milk yield and captured recent inbreeding independently and in addition to overall homozygosity. 
Inbreeding depression can be reduced by minimizing overall inbreeding but maybe also by avoiding the production of offspring that are homozygous for deleterious alleles at specific genomic regions that are associated with inbreeding depression. [ABSTRACT FROM AUTHOR]
- Published
- 2014
23. Application of a Bayesian non-linear model hybrid scheme to sequence data for genomic prediction and QTL mapping.
- Author
Wang T, Chen YP, MacLeod IM, Pryce JE, Goddard ME, and Hayes BJ
- Subjects
- Algorithms, Animals, Bayes Theorem, Cattle, Female, Fertility genetics, Genotype, Markov Chains, Milk metabolism, Monte Carlo Method, Phenotype, Polymorphism, Single Nucleotide, Chromosome Mapping, Genomics, Nonlinear Dynamics, Quantitative Trait Loci genetics, Whole Genome Sequencing
- Abstract
Background: Using whole genome sequence data might improve genomic prediction accuracy, when compared with high-density SNP arrays, and could lead to identification of causal mutations affecting complex traits. For some traits, the most accurate genomic predictions are achieved with non-linear Bayesian methods. However, as the number of variants and the size of the reference population increase, the computational time required to implement these Bayesian methods (typically with Monte Carlo Markov Chain sampling) becomes unfeasibly long. Results: Here, we applied a new method, HyB_BR (for Hybrid BayesR), which implements a mixture model of normal distributions and hybridizes an Expectation-Maximization (EM) algorithm followed by Markov Chain Monte Carlo (MCMC) sampling, to genomic prediction in a large dairy cattle population with imputed whole genome sequence data. The imputed whole genome sequence data included 994,019 variant genotypes of 16,214 Holstein and Jersey bulls and cows. Traits included fat yield, milk volume, protein kg, fat% and protein% in milk, as well as fertility and heat tolerance. HyB_BR achieved genomic prediction accuracies as high as the full MCMC implementation of BayesR, both for predicting a validation set of Holstein and Jersey bulls (multi-breed prediction) and a validation set of Australian Red bulls (across-breed prediction). HyB_BR had a tenfold reduction in compute time, compared with the MCMC implementation of BayesR (48 hours versus 594 hours). We also demonstrate that in many cases HyB_BR identified sequence variants with a high posterior probability of affecting the milk production or fertility traits that were similar to those identified in BayesR. 
For heat tolerance, both HyB_BR and BayesR found variants in or close to promising candidate genes associated with this trait and not detected by previous studies. Conclusions: The results demonstrate that HyB_BR is a feasible method for simultaneous genomic prediction and QTL mapping with whole genome sequence in large reference populations.
- Published
- 2017
24. A computationally efficient algorithm for genomic prediction using a Bayesian model.
- Author
Wang T, Chen YP, Goddard ME, Meuwissen TH, Kemper KE, and Hayes BJ
- Subjects
- Animals, Bayes Theorem, Cattle, Male, Models, Genetic, Models, Statistical, Polymorphism, Single Nucleotide, Algorithms, Breeding, Genomics methods
- Abstract
Background: Genomic prediction of breeding values from dense single nucleotide polymorphism (SNP) genotypes is used for livestock and crop breeding, and can also be used to predict disease risk in humans. For some traits, the most accurate genomic predictions are achieved with non-linear estimates of SNP effects from Bayesian methods that treat SNP effects as random effects from a heavy-tailed prior distribution. These Bayesian methods are usually implemented via Markov chain Monte Carlo (MCMC) schemes to sample from the posterior distribution of SNP effects, which is computationally expensive. Our aim was to develop an efficient expectation-maximisation algorithm (emBayesR) that gives similar estimates of SNP effects and accuracies of genomic prediction to the MCMC implementation of BayesR (a Bayesian method for genomic prediction), but with greatly reduced computation time. Methods: emBayesR is an approximate EM algorithm that retains the BayesR model assumption with SNP effects sampled from a mixture of normal distributions with increasing variance. emBayesR differs from other proposed non-MCMC implementations of Bayesian methods for genomic prediction in that it estimates the effect of each SNP while allowing for the error associated with estimation of all other SNP effects. emBayesR was compared to BayesR using simulated data, and real dairy cattle data with 632 003 SNPs genotyped, to determine if the MCMC and the expectation-maximisation approaches give similar accuracies of genomic prediction. Results: We were able to demonstrate that allowing for the error associated with estimation of other SNP effects when estimating the effect of each SNP in emBayesR improved the accuracy of genomic prediction over emBayesR without including this error correction, with both simulated and real data. When averaged over nine dairy traits, the accuracy of genomic prediction with emBayesR was only 0.5% lower than that from BayesR. 
However, emBayesR reduced computing time up to 8-fold compared to BayesR. Conclusions: The emBayesR algorithm described here achieved similar accuracies of genomic prediction to BayesR for a range of simulated and real 630 K dairy SNP data. emBayesR needs less computing time than BayesR, which will allow it to be applied to larger datasets.
- Published
- 2015