Identification of Predictor Genes for Feed Efficiency in Beef Cattle by Applying Machine Learning Methods to Multi-Tissue Transcriptome Data
- Source :
- Frontiers in Genetics, Vol 12 (2021); Repositório Institucional da USP (Biblioteca Digital da Produção Intelectual), Universidade de São Paulo (USP)
- Publication Year :
- 2021
- Publisher :
- Frontiers Media S.A., 2021.
Abstract
- Machine learning (ML) methods have shown promising results in identifying genes when applied to large transcriptome datasets. However, no study has compared the performance of combined ML methods in predicting high feed efficiency (HFE) and low feed efficiency (LFE) animals. In this study, using RNA sequencing data from five tissues (adrenal gland, hypothalamus, liver, skeletal muscle, and pituitary) of nine HFE and nine LFE Nellore bulls, we evaluated the prediction accuracy of five analytical methods in classifying FE animals. These comprised two conventional methods for differential gene expression (DGE) analysis (t-test and edgeR) as benchmarks, and three ML methods: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a combination of RF and XGBoost (RX). The utility of a subset of candidate genes selected by each method for classifying FE animals was assessed with a support vector machine (SVM). Among all methods, RX identified the smallest subset of genes (117), and this subset outperformed those chosen by t-test, edgeR, RF, or XGBoost in classification accuracy. Gene co-expression network analysis confirmed that these genes interact and that their relevance within the network is related to their ML-based prediction ranking. The results demonstrate the potential of applying a combination of ML methods to large transcriptome datasets to identify biologically important genes for accurately classifying FE animals.
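The t-test benchmark named in the abstract can be illustrated with a minimal sketch. It ranks genes by the absolute Welch t statistic between HFE and LFE animals. This is not the authors' pipeline: the function names (`welch_t`, `rank_genes`), the gene labels, the animal IDs, and all expression values are hypothetical and chosen only for illustration. The record does not say which t-test variant was used, so Welch's form is an assumption. The ML steps (RF, XGBoost, SVM) are not shown, since they would normally rely on dedicated libraries.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr, hfe_ids, lfe_ids):
    """Return gene names ordered by |t| between HFE and LFE, largest first."""
    scores = {}
    for gene, by_animal in expr.items():
        hfe = [by_animal[i] for i in hfe_ids]
        lfe = [by_animal[i] for i in lfe_ids]
        scores[gene] = abs(welch_t(hfe, lfe))
    return sorted(scores, key=scores.get, reverse=True)


# Made-up log-scale expression values for three hypothetical genes
# measured in three HFE and three LFE animals (toy data, not from the study).
hfe_ids = ["h1", "h2", "h3"]
lfe_ids = ["l1", "l2", "l3"]
expr = {
    "geneA": {"h1": 5.0, "h2": 5.2, "h3": 4.9, "l1": 5.1, "l2": 5.0, "l3": 5.05},
    "geneB": {"h1": 8.0, "h2": 8.3, "h3": 8.1, "l1": 6.0, "l2": 6.2, "l3": 5.9},
    "geneC": {"h1": 3.0, "h2": 3.5, "h3": 2.8, "l1": 3.9, "l2": 3.6, "l3": 4.1},
}

ranking = rank_genes(expr, hfe_ids, lfe_ids)
print(ranking)
```

In this toy input, `geneB` shows the clearest separation between the two groups and therefore ranks first. A real analysis would also correct for multiple testing across thousands of genes.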
- Subjects :
- 0301 basic medicine
Candidate gene
lcsh:QH426-470
RNA-Seq
Bos indicus
Biology
Extreme Gradient Boosting
Machine learning
Feed conversion ratio
Transcriptome
03 medical and health sciences
Genetics
Gene
Genetics (clinical)
Original Research
co-expression network
Random Forest
0402 animal and dairy science
04 agricultural and veterinary sciences
040201 dairy & animal science
Support vector machine
lcsh:Genetics
030104 developmental biology
residual feed intake
Molecular Medicine
Genetic sequencing
Artificial intelligence
Artificial intelligence
Details
- Language :
- English
- ISSN :
- 1664-8021
- Volume :
- 12
- Database :
- OpenAIRE
- Journal :
- Frontiers in Genetics
- Accession number :
- edsair.doi.dedup.....031c7d85e1035147cdcd8394abd858b9