Machine learning techniques and interpretability for maize yield estimation using Time-Series images of MODIS and Multi-Source data.

Authors :
Lyu, Yujiao
Wang, Pengxin
Bai, Xueyuan
Li, Xuecao
Ye, Xin
Hu, Yuchen
Zhang, Jie
Source :
Computers & Electronics in Agriculture. Jul 2024, Vol. 222.
Publication Year :
2024

Abstract

• A new Regional Geo-Statistics (RGS) method is proposed to better extract remote sensing features for yield estimation.
• Adding the surface reflectance bands to the vegetation index as features can improve the accuracy of yield estimation.
• The multi-source data achieve the best yield estimation results.
• LightGBM is capable of yield prediction on remote sensing data.
• Quantitative analysis of the contribution of multi-source data.

Timely and accurate estimation of maize yield is crucial for ensuring food security. This study integrated multi-source data (satellite, meteorological, and soil data) on Google Earth Engine (GEE) and used machine learning techniques to estimate summer maize yield across 469 counties in the Huang-Huai-Hai Plain of China from 2010 to 2020. A novel method for extracting features from remote sensing images, the Regional Geo-Statistics (RGS) method, was proposed. We compared its performance with the traditional county-level averages method and explained the impact and contributions of incorporating multi-source data into yield estimation models. First, the Enhanced Vegetation Index (EVI) and Near-Infrared Vegetation Reflectance (NIRv) were transformed into regional geostatistical vectors and county-level averages using GEE. Then, yield was estimated using LightGBM, random forest (RF), and LASSO. The results highlighted the superiority of the RGS method, with LightGBM yielding the best model (R² = 0.55, RMSE = 852.92 kg/ha, NRMSE = 13.66 %). Further improvements were achieved by gradually adding meteorological and soil variables, with R² improvements of 0.03 and 0.20, respectively. The Comprehensive Factor Model (CFM), which used all data as features, achieved the best results (R² = 0.76, RMSE = 629.33 kg/ha, NRMSE = 10.08 %).
The interpretability analysis based on the CFM showed that soil-related variables contributed substantially (45.38 %) alongside remote sensing variables, underscoring their crucial role in maize yield estimation. This study presents a versatile and reliable method for integrating multi-source data for maize yield estimation, supporting agricultural management and food security. [ABSTRACT FROM AUTHOR]
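
The three accuracy figures quoted in the abstract (the coefficient of determination, RMSE, and NRMSE) can be computed from paired observed and predicted yields. The sketch below is a minimal illustration and not the authors' code. It assumes NRMSE is RMSE divided by the mean observed yield, expressed as a percentage. The abstract does not state the normalization, although both reported RMSE/NRMSE pairs imply a divisor of roughly 6,240 kg/ha, which would be consistent with a mean yield. The function name `yield_metrics` and the sample values are hypothetical.

```python
import numpy as np


def yield_metrics(observed, predicted):
    """Return R^2, RMSE (same unit as the inputs), and NRMSE in percent.

    Assumption: NRMSE = 100 * RMSE / mean(observed). Other conventions
    (for example, dividing by the observed range) also exist.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    residuals = observed - predicted
    ss_res = float(np.sum(residuals ** 2))                        # residual sum of squares
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))     # total sum of squares

    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    r2 = 1.0 - ss_res / ss_tot
    nrmse_pct = 100.0 * rmse / float(observed.mean())
    return {"r2": r2, "rmse": rmse, "nrmse_pct": nrmse_pct}


# Made-up county-level yields in kg/ha, for illustration only.
observed = [6000.0, 6500.0, 5800.0, 7000.0]
predicted = [6100.0, 6300.0, 6000.0, 6800.0]
metrics = yield_metrics(observed, predicted)
```

Because RMSE keeps the unit of the yield data (kg/ha) while NRMSE is relative, the two are best read together when comparing models across regions with different yield levels.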

Details

Language :
English
ISSN :
01681699
Volume :
222
Database :
Academic Search Index
Journal :
Computers & Electronics in Agriculture
Publication Type :
Academic Journal
Accession number :
177880360
Full Text :
https://doi.org/10.1016/j.compag.2024.109063