1. Multi-variate mixed-models for the normalization of RNA-Seq data: Application to onset of puberty in beef cattle
- Author
- Tusell, Llibertat; David, Ingrid; Canovas, Angela; Thomas, Milton G; Reverter, Antonio. Génétique Physiologie et Systèmes d'Elevage (GenPhySE), École nationale supérieure agronomique de Toulouse [ENSAT]-Institut National de la Recherche Agronomique (INRA)-Ecole Nationale Vétérinaire de Toulouse (ENVT), Institut National Polytechnique (Toulouse) (Toulouse INP), Université Fédérale Toulouse Midi-Pyrénées, University of Guelph, Colorado State University [Fort Collins] (CSU), and Commonwealth Scientific and Industrial Research Organisation [Canberra] (CSIRO)
- Subjects
[SDV.OT] Life Sciences [q-bio]/Other [q-bio.OT], beef cattle, pre- and post-puberty physiological state, multivariate model, character state model, RNA-Seq data
- Abstract
Methods based on univariate mixed models are used to normalize RNA-Seq data as an initial step to detect differentially expressed (DE) genes based on the gene by experimental condition interaction term. Character state models are classically used in quantitative genetics to assess genotype by environment interactions in discrete environments. This approach considers phenotype measurements in different environments as different traits (or character states). Thus, the interaction variance can be estimated as a function of the genetic variances and covariances of the genetic effects in the environments. In this study, we propose a multi-variate mixed model approach (i.e., a character state model approach) to normalize RNA-Seq data and to detect DE genes as well as potential interaction variance between the genes and (i) pre- and post-puberty periods and (ii) several tissues (i.e., muscle, fat, liver, uterus, ovary, pituitary gland and hypothalamus) in composite beef cattle. A total of 1,087,752 base-2 log-transformed Reads Per Kilobase of transcript per Million mapped reads (RPKM) values were analyzed, with the pre- and post-puberty physiological states treated as 2 different traits in a bivariate model. The model includes the systematic effect of library (61 levels) and the random effects of gene (17,832 genes) and gene × tissue (142,656 levels). The bivariate model detected DE genes in a similar way to the univariate model (98% of DE genes in common). The interaction variance between the genes and the puberty physiological states was small (0.02) because the estimated correlation of the gene effects between the 2 states was close to unity (0.98) and the gene variances in the 2 physiological states were of similar magnitude (6.78 and 6.76 in pre- and post-puberty environments, respectively). Further research is warranted to assess the optimality of a multi-variate mixed model to evaluate the interaction variance across 8 different tissues. In this second model, the log-transformed RPKM reads measured in the tissues will be considered to be different traits, while the differential expression of interest will still be between pre- and post-puberty physiological states.
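As an illustration of the response variable described in the abstract, the following is a minimal sketch of a base-2 log-transformed RPKM value, using the standard RPKM definition. The function names, the example read counts, and the optional `offset` argument are assumptions made for illustration; the abstract does not say how zero counts were handled or which software was used. The final checks only confirm that the level counts quoted in the abstract are consistent with one value per gene per library and with 8 tissue levels per gene.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Standard definition: reads * 1e9 / (gene length in bp * total mapped reads).
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_rpkm(read_count, gene_length_bp, total_mapped_reads, offset=0.0):
    """Base-2 log of RPKM.

    `offset` is a hypothetical pseudocount (not stated in the abstract) that
    could be added before taking the log to avoid log2(0).
    """
    value = rpkm(read_count, gene_length_bp, total_mapped_reads) + offset
    if value <= 0:
        raise ValueError("RPKM plus offset must be positive to take log2")
    return math.log2(value)


# Made-up example: 500 reads on a 2,000 bp transcript in a library
# with 10 million mapped reads gives an RPKM of 25.
example_rpkm = rpkm(500, 2000, 10_000_000)
example_log2 = log2_rpkm(500, 2000, 10_000_000)

# Counts quoted in the abstract.
n_genes = 17_832
n_libraries = 61
n_records = 1_087_752
n_gene_by_tissue = 142_656

# Consistent with one log2 RPKM value per gene per library.
records_match = n_genes * n_libraries == n_records
# Consistent with 8 tissue levels per gene.
tissue_levels = n_gene_by_tissue // n_genes

print(example_rpkm, round(example_log2, 3), records_match, tissue_levels)
```

In a bivariate analysis of the kind the abstract outlines, each such value would be assigned to either the pre-puberty or the post-puberty trait according to the physiological state of the library it comes from.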
- Published
- 2019