1. A comparative study of machine learning models for respiration rate prediction in dairy cows: Exploring algorithms, feature engineering, and model interpretation.
- Author
- Yan, Geqi; Zhao, Wanying; Wang, Chaoyuan; Shi, Zhengxiang; Li, Hao; Yu, Zhenwei; Jiao, Hongchao; Lin, Hai
- Subjects
- *MACHINE learning, *DAIRY cattle, *MILK quality, *STANDARD deviations, *FEATURE selection, *RESPIRATION, *NATURAL ventilation
- Abstract
The respiration rate (RR) of dairy cows is a crucial welfare indicator for assessing heat stress in cows exposed to high temperatures. Machine learning (ML) models can automatically identify patterns from factors related to cow RR. This study utilised ML methods to establish a predictive model for cow RR using easily accessible variables in real production settings. A comparison of 20 ML algorithms, including linear regression, neural networks, and others, was conducted to evaluate their performance in predicting cow RR and investigate the impact of different inputs and feature engineering techniques on algorithm performance, using a cleaned dataset comprising 2977 records. The main findings indicate that the CATBOOST-based model, specifically the CATBOOST algorithm with environmental parameters as input features under ordinal encoding, exhibited the best performance, with a coefficient of determination (R²) of 0.676, a mean absolute error (MAE) of 7.246, a mean absolute percentage error (MAPE) of 13.8%, and a root mean square error (RMSE) of 9.341. There was no statistical difference in model performance using environmental parameters, heat indices, or heat flows as input features. Feature polynomials, PCA-based dimensionality reduction, and filter-based feature selection may significantly reduce the performance of the ML-based cow RR model. Additionally, according to the SHAP analysis of the optimal model, air temperature, black globe temperature, and airflow speed are identified as the top three factors contributing to the prediction of cow RR. The findings from this study can offer valuable guidance for the design and regulation of dairy farm environmental control systems.
• 20 ML algorithms under 24 feature engineering schemes were evaluated.
• The CATBOOST-based model outperformed other models.
• SHAP analysis was used to explain the optimal model.
[ABSTRACT FROM AUTHOR]
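The four scores quoted in the abstract (R², MAE, MAPE, RMSE) follow standard textbook definitions, and a minimal sketch of how they are computed may help when reading the reported values. The function name `regression_metrics`, the toy numbers, and the unit (breaths per minute) are assumptions made for illustration only; none of them come from the study, and this is not the authors' evaluation code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MAPE (in percent), and RMSE for paired values.

    Hypothetical helper for illustration. It assumes y_true contains no
    zeros (MAPE divides by each observed value) and is not constant
    (R^2 divides by the total sum of squares).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Made-up respiration rates (assumed unit: breaths per minute), NOT study data.
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 67.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE are expressed in the same unit as the predicted quantity, while R² and MAPE are unitless, which is why only MAPE is reported as a percentage.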
- Published
- 2024